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ABSTRACT 

Identification  of  cloud  types  through  cloud  classification  using  satellite  observations  is  yet  to  produce  consistent 
and  dependable  results.  Cloud  types  are  too  varied  in  their  geophysical  parameters,  as  measured  by  satellite 
remote  sensing  instruments,  to  provide  for  a  direct  accurate  classification.  To  aid  in  classification,  texture 
measures  are  additionally  employed.  These  measures  characterize  local  spectral  variations  in  images.  They  are 
widely  used  for  image  segmentation,  classification,  and  edge  deux  non.  Numerous  methods  have  been  developed 
to  extract  textural  features  from  an  image  on  the  basis  of  spatial  ana  spectral  properties  of  the  image.  In  our  effort, 
several  of  these  methods  are  sfdied  for  their  applicability  in  cloud  classification  and  cloud  feature  identification. 
The  examined  texture  methois  include  a)  spatial  gray-level  co-occurrence  matrices,  b)  gray-level  difference  vector 
method,  and  c)  a  class  of  filters  known  as  Gabor  transforms.  Methods  a)  and  b)  are  spatial  and  staustical  while 
method  c)  is  in  the  frequency  domain.  A  series  of  comparative  tests  have  been  performed  applying  these  methods 
to  NOAA-AVHRR  satellite  data.  A  discussion  as  to  the  suitability  of  these  texture  methods  for  cloud  classification 
concludes  this  study. 


Key  Words:  texture  analysis,  cloud  classification,  Gabor  transforms,  spatial  gray-level  co-occurrence  matrices, 
gray-level  difference  vector  (GLDV),  NOAA-AVHRR. 


INTRODUCTION 

Identification  of  cloud  types  by  automated  cloud  classifiers, 
which  operate  on  a  pixel  by  pixel  basis,  has  yet  to  show 
dependable  and  accurate  results.  Clouds  have  geophysical 
parameters  which  are  too  inconsistent,  as  measured  by  satellite 
remote  sensing  instruments,  to  provide  for  a  direct  accurate 
classification.  No  method  developed  to  date  provides  a 
reliable  spectral  signature  which  would  uniquely  identify  a 
specific  cloud  type  anywhere  on  the  Earth  globe  during  any 
season.  Cloud  types  vary  in  their  spectral  response  at  different 
latitudinal  locations  and  at  different  times  of  the  year.  These 
variations  complicate  methods  required  for  cloud  type 
identification  using  remote  sensing  techniques. 

Surveying  the  various  available  statistical,  structural,  and 
frequency  domain  techniques  applied  to  cloud  classification,  it 
appears  that  there  are  not  enough  parametrizaiion  vectors  to 
uniquely  separate  any  one  cloud  type.  For  this  reason,  texture 
analysis  methods  are  drawn  upon  in  addition  to  aid  in  this 
problem.  The  use  of  texture  parameters  has  been  reported  on 
extensively  in  recent  literature  (Wechsler,  1980).  Texture 
techniques  used  in  our  study  include  a)  spatial  gray-level  co¬ 
occurrence  matrices,  b)  gray-level  difference  vector  (GLDV) 
method,  and  c)  a  class  of  filters  known  as  Gabor  transforms. 
Each  of  these  approaches  has  unique  merit  for  providing 
additional  inforraauon  about  cloud  masses  within  a  scene. 
These  unique  differences  are  the  focus  in  this  study. 

Images  in  this  case  study  are  composites  of  Advanced  Very 
High  Resolution  Rad'^meter  (AVHRR)  channel  one  and 
channel  four.  Pixel  by  pixel  classifications  of  cloud  types, 
based  on  the  spectral  and  spatial  responses  from  these 
channels,  are  enhanced  with  results  from  the  various  texture 
analysis  algorithms.  Results  of  the  classifications  from  the 
combined  techniques  are  compared  and  discussed. 


DATA 

An  image  from  the  Gulf  of  Alaska  was  chosen  for  this  work. 
This  region  was  selected  due  to  its  high  latitude  which  presents 
challenging  solar  zenith  angles.  It  also  provides  snow  within 
the  scene  which  tests  snow  and  cloud  separation  capabilities  of 
the  candidate  methods.  Furthermore,  the  general 
meteorological  activity  within  this  region  is  high  thereby 
presenting  a  continuous  varying  source  of  frontal  cloud 
masses. 

The  scene  selected  for  presentation  is  one  of  eight  images  used 
in  this  study.  It  is  an  AVHRR  image  from  15  October  1988, 
19Z.  A  full  resolution  (1.1  km  per  pixel)  sector  of  1024  by 
1024  ten-bit  pixels  was  extracted  from  the  original  2048  by 
2048  data  set. 

The  channel  one  and  channel  four  radiance  images  are  shown 
in  Figures  1  and  2.  The  channel  one  image  is  histogram- 
equalized  for  purposes  of  display.  The  channel  four  image  is 
inverted  so  as  to  represent  clouds  in  lighter  gray  shades. 

The  large  band  of  clouds  in  the  extreme  right  of  the  image  is  a 
frontal  cloud  mass  that  has  previously  moved  through  the  area. 
This  cloud  mass  is  characterized  by  high  thick  cirrus  over 
cumulus.  These  clouds  are  brightened  by  their  height  as  well 
as  by  the  low  sun  angle  which  is  characteristic  for  this 
northern  latitude. 

In  the  lower  central  portion  of  the  image  are  well  defined  cloud 
streets.  They  are  trailed  by  open  cell  stratocumulus  and 
altostratus  that  extend  to  the  left  center  of  the  image.  The 
mixed  layered  cloud  mass  in  the  lower  left  portion  of  the  image 
represents  stratus  and  altostratus  with  a  cover  of  thick  cirrus. 
Some  closed  cell  stratocumulus  are  at  the  bottom  of  the  image 
between  the  stratus  and  frontal  clouds. 
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Statistical  Methods 


Snow  can  he  seen  in  the  upper  central  ponton  of  the  image. 
Typically,  snow  will  be  observed  to  have  a  dendritic-like 
structure  w'hich  distinguishes  it  from  cloud  masses. 


TEXTURAL  METHODS 

Texture  is  a  term  used  to  characterize  the  surface  of  a  given 
object.  It  can  also  be  applied  to  an  image  of  a  phenomenon.  It 
is  undoubtedly  one  of  the  main  features  drawn  upon  in  image 
processing  and  pattern  recognition.  Texture  analysis  plays  a 
fundamental  role  in  classifying  objects  and  outlining 
significant  regions  of  a  given  gray  level  image  (Wechsler, 
1980).  Despite  its  ubiquity  in  image  data,  though,  texture 
lacks  a  precise  definition.  Some  definitions  characterize 
texture  as  visual  images  which  possess  some  stochastic 
structure.  Other  definitions  describe  texture  as  an  attribute 
generated  by  a  local  periodic  pattern.  Whatever  the  definition, 
most  algorithms  which  derive  texture  from  an  image  fall  into 
the  categories  of  either  statistical  or  frequency  domain.  A  brief 
description  of  the  three  texture  methods  of  interest  follows. 


Figure  1.  AVHRR  channel  1  ten-bit  radiance 
values,  histogram  equalized  for  display. 


Figure  2.  AVHRR  channel  4  ten-bit  radiance 
values,  inverted  lor  display. 


The  two  most  commonly  used  statistical  texture  methods  are 
the  a)  gray-level  difference  vector  (OLD Vj  method  (Welch  ei 
ul.,  1990,  Khazeme  and  Richardson.  1991).  and  ihe  h)  co¬ 
occurrence  matrix  method  (Haralick.  1973).  Our  current  study 
draws  upon  both  of  these  methods.  Both  methods  extract  a  set 
of  statistical  parameters  from  a  given  image.  Some  of  the 
commonly  extracted  texture  parameiers  are  inertia,  correlation, 
homogeneity,  entropy,  energy,  variance,  skewness,  and 
kurtosis.  These  parameters  are  then  used  as  die  input  features 
to  a  classifier. 

Texture  measures  are  derived  commonly  from  statistical 
parameiers  of  first  or  second  order  The  GLDV  method 
estimates  the  probability  density  function  for  differences  taken 
between  image  function  values  at  locations  spaced  d  pixels 
apart  3nd  at  an  angle  8  The  resulting  texture  measures  are 
based  on  this  first  order  statistic.  The  spatial  co-occurrence 
matrix  method,  on  the  other  hand,  estimates  the  joint  gray  level 
distribution  for  two  gray  levels  located  at  a  dtstance  d  and  at  an 
angle  8.  The  texture  measures  derived  hv  the  co-oceurrem  - 
matriA  ..ictnod  are  based  on  this  second  order  statistic. 

The  co-occurrence  matrix  method  is  used  in  this  study  to 
derive  texture  values  of  entropy,  homogeneity,  energy  (similar 
io  the  GLDV  angular  second  moment),  and  correlation  These 
four  parameters  were  calculated  for  the  radiances  of  each  of  the 
two  channels.  AVHRR  channel  one  and  channel  four,  for  a 
total  of  eight  texture  values.  Each  texture  value  was  processed 
using  three  different  convolution  sizes.  The  n  by  n 
convolution  sizes  are  n  =  3.  9.  and  16  In  addition,  the 
search  angle  for  each  of  the  convolutions  was  varied  to 
determine  whether  or  not  the  derived  textures  possess  any 

angular  dependence.  The  angle  was  set  to  8  =  0.”  45,°  90.“ 

and  135.°  Search  angle  dependence  is  expected  only  when 
the  surface  resolution  is  much  smaller  than  the  1.1  km  surface 
resolution  of  the  AVHRR  instrument  and  indeed,  as  discussed 
later,  no  angular  dependence  was  identified  using  the  co¬ 
occurrence  matrix  method  and  the  given  image  data. 

The  GLDV  technique  was  similarly  applied.  The  same  texture 
values  as  for  the  co-occurrence  matrix  method  were  calculated 
The  calculations  were  performed  on  the  same  channel  one  and 
channel  four  radiance  values,  but  only  for  a  single  search 
angle,  6  =  0.  The  search  angle  non-dependcnce  had  already 
been  established  from  working  with  the  co-occurrence  matrix 
method.  Seven  convolution  sizes  were  chosen  to  derive  the 
texture  values  of  entropy,  local  homogeneity,  and  angular 
second  moment.  The  convolution  sizes  were  n  =  3,  5,  9. 
11,  16,  32,  and  64.  From  a  previous  study  (Khazcnie  and 
Richardson,  1991)  the  three  sizes  of  n  =  3.  16.  and  64 
provided  the  best  statistical  representation  of  the  data  for  use  in 
cloud  classification.  This  finding  was  re-established  in  the 
current  work. 


Frequency  Domain  Methods 

Spatial  granularity  and  repetitiveness  is  one  of  the 
characteristic  aspects  of  texture.  Both  can  be  quantified  by 
looking  at  the  frequency  content  of  an  image.  It  is  therefore 
reasonable  to  expect  that  transform  techniques  arc  suitable  for 
extracting  texture  information  from  images. 

The  Fourier  transform  analysis  method  (Lendans  ei  ai..  19701 
is  a  procedure  which  works  in  the  frequency  domain  ll  is.  by 
far.  the  most  used  transform  method  Image  features,  such  as 
spectral  rings  or  edges,  arc  derived  from  the  image  power 
spectrum  by  this  technique. 

Related  to  the  Fourier  transform  are  functions  first  introduced 
by  Gahor  (Gahor.  1946).  These  functions  have  been  extended 
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to  two  dimensions  (Daugman.  1980)  resulting  in  what  is 
known  as  the  two-dimensional  (2-D)  Gabor  filters. 


Co-occurrence  Matrix 


One  of  the  unique  properties  of  Gabor  filters  is  their  ability  to 
discriminate  textural  features  in  a  way  similar  to  that  of  human 
vision  (Fogel  et  al.,  1989).  Another  important  property  is 
their  achievement  of  the  theoretical  lower  bound  of  joint 
uncertainty  in  the  two  dimensions  of  visual  space  and  spatial 
frequency  variables  (Bovik  et  al..  1990),  Additional 
•advantages  of  Gabor  transforms  include  their  tunable  spatial 
orientation,  radial  frequency  bandwidths,  and  tunable  center 
frequencies. 

The  2-D  Gabor  filter  is  a  harmonic  oscillator,  a  sinusoidal 
plane  wave  within  a  Gaussian  envelope.  The  convolution 
version  of  the  complex  2-D  Gabor  function  has  the  following 
general  form. 


G(x,  y  I  W,  Q,  <p,  X,  Y)  = 

•[( x-X )2  +  (y-Y)2] 


2a2 

sia(W(xcosO  -  ysind)+q>) 


(D 


In  equation  (1),  the  Gaussian  width  is  a „  the  filter  orientation 

is  9.  the  frequency  is  W,  and  the  phase  shift  is  <p.  Variables 
X  and  Y  define  the  center  of  the  filter. 

The  Gabor  function,  equation  (1),  can  be  represented  as  a 
complex  function  having  a  real  and  an  imaginary  component, 
Gj  and  G2,  respectively. 

G,(x,  y  I  W,  6,  <p  =  0,  X,  Y) 


Gj(x,  y  I  W,  0,  tp  =  f,  X,  Y) 

Functions  Gj  and  G2  are.  respectively,  even  and  odd 
symmetric  along  the  preferred  orientation  direction  6.  The 
results  of  convoluting  Gj  and  G2  with  any  two-dimensional 
function  are  identical  except  for  a  spectral  shift  of  kJ2  along 
the  direction  9, 

Given  an  image  I(x,  y),  its  Gabor  transformation  for  a  given 
filter  size  n  with  orientation  angle  9  and  frequency  W  is  given 
by  the  following  equation. 


The  texture  results  from  the  co-occurrence  matrix  algorithm 
were  first  classified  alone  for  each  of  the  convolution  sizes  and 
search  angles.  Figure  3  is  the  result  of  this  classification  for 
an  n  by  n  convolution  size  where  n  =  3  and  for  a  search  angie 

9  =  45. 0  Th.s  represents  the  best  result  for  all  of  the 
classifications  from  the  co-occurrence  matrix  algorithm. 


Figure  3.  Texture  values  from  the  co-occurrence 
matrix  algorithm,  classified  and  scaled 
up  for  display. 

It  is  clear  from  Figure  3  that  the  texture  values  alone  do  not 
represent  cloud  types  with  any  accuracy.  Figure  3  shows  nine 
classes,  but  none  identify  any  of  the  cloud  types  uniquely 
The  classified  images  are  extremely  noisy  and  at  best  represent 
features  within  the  cloud  masses  rather  than  the  cloud  types 
themselves.  There  is.  however,  one  reasonably  accurate 
feature  which  resulted  from  this  classification.  For  the  lowest 
convolution  size,  at  all  search  angles,  the  clear  vs.  cloudy 
areas  are  quite  distinct. 

For  convolution  sizes  of  n  =  9  and  16.  for  all  search  angles, 
the  classification  separates  the  clear  areas  from  the  cloudy  ones 
with  success  as  well.  It  also  produces  a  smoother 
classification.  The  cloud  types,  though,  are  still  difficult  to 
identify. 


S2(X,Y  I  =  (G,  *  I(x,y)l2  +  [C2  •  l(x,y)]2 

The  Gabor  filter  described  by  equation  ( 1)  was  applied  to  the 
AVHRR  test  images'  channel  one  and  channel  four  radiances. 

The  response  was  evaluated  for  filters  with  9  =  0.°  45.° 

90.°  and  135  °  The  frequency  W  was  set  to  2ref/(n/2) 
where  f  =  0.5.  0.6.  0.7.  0.8,  0.9,  and  1.0.  The  tested 
filter  sizes  n  were  9,  17,  33,  and  65  <  The  best  results  for 
cloud  typing  from  the  four  convolutions  was  for  n  =  17. 


RESULTS 

The  1024  by  1024  ten-bit  radiance  data  from  channels  one 
and  four  was  used  as  input  to  the  various  texture  algorithms. 
The  texture  output  was  then  resized  back  to  the  full  1024  by 
1024  resolution  and  added,  as  supplementary  channels,  to  the 
radiance  data.  The  resulting  N  channel  data  set  was  classified 
using  a  standard  statistical  unsupervised  classifier. 


The  classifications  were  performed  twice  on  the  texture  results 
where  n  =  9  and  16.  The  first  classification  was  performed 
on  all  of  the  texture  values  derived.  The  second  classification 
was  performed  on  the  same  values  except  for  the  correlation 
parameter.  The  results  of  the  cloud  type  classification  neither 
improved  nor  degraded.  Therefore  it  seems  that  the  correlation 
parameter  does  not  contribute  to  the  information  needed  for 
cloud  typing. 

In  ail  cases  of  classifying  only  the  co-occurrence  matrix  texiure 
results,  the  thick  cirrus  was  identifiable  as  a  homogeneous 
feature,  yet  it  was  assigned  the  same  class  as  portions  of  the 
open  cell  stratocumulus.  Also  in  all  cases,  the  snow  was  not 
separated  from  the  clouds  The  snow  was  assigned  the  same 
class  as  the  stratus  and  altostratus  clouds. 

It  was  concluded  at  this  point  that  classifying  texture  values 
alone  does  not  provide  sufficient  results  for  identifying  cloud 
types.  The  next  step  then  was  to  provide  more  information  to 
the  classifier.  The  eight  texture  values  were  combined  with  the 
two  AVHRR  channel  radiances  (channel  one  and  four!  and 
classification  was  performed  on  the  resulting  ten  channels  of 
data. 


The  texture  values  for  convolution  size  of  n  =  9  were  resized 
to  the  full  1024  by  1024  resolution,  equal  to  that  of  the 
channel  radiances,  merged  with  the  channel  radiances,  and  the 
resulting  data  set  was  classified.  The  output  is  shown  in 
Figure  4. 

The  search  angle  was  varied  as  before,  but  the  results  from  the 
classifier  showed  virtually  no  differences  for  unequal  angles. 
The  results  for  varied  search  'ingles  were  compared  by 
calculating  difference  images.  Only  minor  variations  were 
noted  in  some  of  the  mixed  layer  cloud  types  amounting  for 
less  than  1  %  difference  over  the  entire  image.  From  this  it 
was  concluded  that  the  process  is  not  dependent  on  search 

angle  and  all  further  comparisons  were  made  setting  9  =  0  ° 


Figure  4.  Combination  of  AVHRR  channel  1, 
channel  4,  and  texture  values  from  the 
co-occurrence  matrix  algorithm,  clas¬ 
sified. 

The  frontal  cloud  mass  at  the  extreme  right  of  the  image  in 
Figure  4  is  represented  by  four  distinct  classes.  The  thick 
cirrus,  the  altocumulus,  the  cumulus,  and  the  lower  level 
stratocumulus  each  appear  as  distinct  cloud  types.  They  are 
affected  by  the  sun  angle  thereby  giving  the  cirrus  over  the 
frontal  cloud  mass  a  different  class  than  the  cirrus  over  the 
stratus  in  the  lower  left  portion  of  the  image. 

The  classes  representing  the  stratus  clouds  provide  more 
separation  of  cloud  types  than  a  human  pholointerpcrter  would 
give.  Should  the  goal  be  to  duplicate  human  performance,  one 
can  easily  combine  some  of  the  statistical  classes.  However, 
our  goal  was  to  obtain  parameter  vectors  for  performing 
unsupervised  cloud  classifications,  no  matter  how  many 
vectors  there  may  be,  as  long  as  the  distinct  cloud  types  can  be 
separated  from  each  another.  That  goal  was  achieved. 


Gray  Level  Difference  Vector 

The  texture  results  from  the  gray  level  difference  vector 
(GLDV)  algorithm  were  first  classified  alone,  identically  as  for 
the  co-occurrence  matrix.  Similarly,  the  classification  results 
from  these  texture  values  alone  do  not  provide  cloud  type 
information  directly.  The  results  arc  essentially  identical  to 
those  shown  in  Figure  3  for  the  co-occurrence  matrix.  The 
textures  values,  when  classified,  identify  edges  between 
features  within  the  image  well,  but  the  features  are  various 
ureas  within  the  cloud  type  rather  than  the  cloud  type  itself. 

As  with  the  co-occurrence  matrix  method,  the  GLDV  performs 
very  well  at  identifying  cloud  versus  no  cloud  areas  within  the 
scene.  It  is  not  know  at  this  lime,  however,  if  this  capability 


can  be  extended  easily  to  all  AVHRR  images.  With  more 
study  this  may  indeed  prove  to  he  the  case  Simple 
thresholding  of  the  texture  values  may  be  all  that  is  required 
Results  from  a  previous  study  (Khazenie  and  Richardson, 
1991)  support  this  conjecture. 

As  with  the  co-occurrence  matrix  output,  the  textures  derived 
hy  the  GLDV  were  then  combined  with  the  channel  one  and 
channel  four  radiances.  The  resulting  eight  channels  were 
classified  and  the  outcome  is  shown  in  Figure  5  Again,  the 
results  are  essentially  identical  to  those  from  the  co-occunvnce 
matrix  method  (Figure  4)  in  their  ability  to  type  clouds  Sun 
angle  remains  a  problem  within  the  frontal  cloud  region  at  the 
extreme  right  of  the  image.  However,  the  mixed  layer  clouds 
in  the  lower  left  show  the  same  successful  level  of  cloud  type 
separation  as  with  the  co-occurrence  matrix  method. 


Figure  5.  Combination  of  AVHRR  channel  1, 
channel  4,  and  texture  values  from  the 
GLDV  algorithm,  classified. 


Convolution  size  plays  a  role  in  the  ability  to  type  clouds.  For 
n  >  16  the  algorithm  is  able  to  identify  the  presence  of 
clouds.  It  is  also  able  to  determine  that  the  texture  in  the 
region  is  unique.  However,  it  does  not  provide  enough 
information  to  the  classifier  to  separate  cloud  types.  Although 
die  statistical  significance  is  in  favor  of  the  higher  convolution 
sizes,  it  is  the  lower  convolution  sizes  that  provide  the  textural 
signTicance  to  the  classifier  for  cloud  type  identification. 


Gabor  Filters 

Figure  6  presents  the  result  of  classifying  the  test  image 
channel  one  and  channel  four  radiances  combined  with  the 
Gabor  filter  output  where  n  =  17  and  phase  angle  <p  =  0  Of 
the  available  2048  by  2048  data,  the  same  1024  by  1024 
scene  was  originally  acquired  as  for  the  statistical  methods. 
However,  computer  resources  available  for  the  study  of  Gabor 
filters  could  digest  no  more  than  512  by  512  images. 
Therefore,  only  the  lower  left  quarter  of  each  1024  hy  1024 
scene  was  analyzed.  One  such  quarter  is  shown  in  Figure  6. 

The  thick  cirrus  over  the  stratus  is  well  separated.  This  is  a 
great  improvement  over  the  classification  of  texture  values 
from  the  Gabor  filter  alone.  Indeed  the  classifications  of  the 
combined  image,  radiances  and  textures  shown  in  Figure  6, 
are  much  easier  to  label  than  are  either  of  the  classifications 
based  on  texture  only. 

Convolution  sizes  n  >  17  do  not  perform  well  for  a  wide 
variety  of  cloud  types  within  a  scene.  This  follow  s  along  w  ith 
the  same  findings  as  for  the  statistical  textural  methods 
Important  textural  attributes  in  the  cloud  mass  are  lost  when 
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higher  convolution  sizes  are  used.  With  such  convolution 
sizes,  the  textural  analysis  shows  the  dominant  texture  over 
each  cloud  mass,  but  does  not  sufficiently  indicate 
characteristic  features  of  that  cloud  mass  thereby  missing  the 
classification  of  the  cloud  type.  In  particular,  the  open  cel! 
stratocumulus  and  altostratus  within  the  image  did  noi  separate 
out  at  the  higher  convolution  sizes. 


Figure  6.  Combination  of  AVHRR  channel  1. 

channel  4,  and  texture  values  from  the 
Gabor  filter,  classified.  The  results 
shown  cover  only  the  lower  left  quarter 
of  the  test  image. 

While  it  has  not  been  tested  in  this  study  directly,  it  is 
conjectured  that  the  approach  using  the  Gabor  filter  will  be  less 
sensitive  to  latitude  variances  and  seasonal  changes  compared 
to  the  two  statistical  techniques.  The  Gabor  filter  is  a  tunable 
algorithm.  With  proper  adjustment  of  control  parameters  it 
should  be  possible  to  desensitize  the  filter  to  local  effects,  such 
as  latitude  changes  and  seasonal  effects,  while  retaining  the 
ability  to  extract  the  required  physical  response  which  uniquely 
represents  each  cloud  tyjiC. 


Inter-comparisons 

The  results  from  classifying  only  the  output  of  the  two 
statistical  textural  methods,  the  co-occurrence  matrix  and  the 
GLDV,  are  almost  exactly  alike.  This  is  reasonable  since  the 
texture  measures  calculated  by  both  of  these  algori'hms  were 
the  same.  One  should  expect  the  same  results  even  though 
they  were  arrived  at  by  different  means.  The  GLDV  is  based 
on  first  order  statistics  while  the  co-occurrence  matrix  method 
is  based  on  second  order  statistics.  It  can  therefore  be 
concluded  that,  given  our  test  images,  the  extra  complexity  of 
the  second  order  statistics  is  not  necessary  for  arriving  at 
satisfactory  results. 

It  is  also  noteworthy  that  the  results  from  ihe  two  statistical 
methods  are  virtually  identical  even  though  different 
convolution  sizes  were  used.  This  suggests  that  the  features, 
which  distinguish  cloud  types,  are  fairly  coarse.  It  also 
suggests  a  lack  of  fine  features  which  would  distract  a  method 
that  uses  a  small  convolution  size. 

The  output  of  the  Gabor  filler  has  different  characteristics 
compared  to  the  output  of  the  co-occurrence  matrix  and 
GLDV,  yet  the  ability  to  separate  cloud  types  is  very  similar 
for  all  three  methods.  The  resolution  of  the  Gabor  filter  output 
is  lower,  seventeen  pixels  versus  nine  for  the  co-occurrence 
matrix  and  three  for  the  GLDV  The  classification  results  are 
greatly  influenced  by  this  resolution  difference.  This  is 
obvious  by  comparing  the  pixel  size  tn  Figure  6  with  Figure  4. 


In  Figure  6  the  boundary  of  the  thick  cirrus  over  the  stratus 
has  been  smoothed  This  is  also  true  for  ihe  stratus  cloud 
types.  The  separation  of  multilayered  clouds  is  similar  for  all 
three  methods. 

The  computer  processing  time  required  for  the  Gabor  titter 
method  proved  much  less  than  either  of  ihe  iwo  statistical 
methods  (co-occurrence  matrix  or  GLDV),  Processing  a  full 
1024  by  1024  scene  using  the  Gabor  filter  look 
approximately  one  minute  on  a  SUN  SparcStation  11.  The 
statistical  methods  required  approximately  ten  minutes  each  for 
the  same  image. 


CONCLUSION 

Classification  of  cloud  types  using  spectral  and  derived 
textural  parameter  vectors  alone  has  not  been  completely 
successful.  Additional  information  about  texture  in  the  image 
provides  more  input  to  a  cloud  classifier.  Such  an  addition 
shows  considerable  improvement  over  cloud  classification 
based  only  on  spectral  information.  Despite  the  marked 
improvement,  however,  it  does  not  yet  appear  that  the  addiuon 
of  texture  information  provides  all  of  the  necessary  parameters 
required  to  successfully  classify  and  completely  label  cloud 
types.  Nevertheless,  results  from  this  study  indicate  that 
Gabor  filters  applied  to  the  spectral  data  set.  and  used  in 
conjunction  with  the  spectral  data  for  classification,  extract 
cloud  types  better  and  faster  than  the  chcr  techniques 
explored. 
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Work  has  also  continued  into  special  case  GP  algorithms:  Integer,  Zero-One, 
Fuzzy,  Interactive  and  Chance-Constrained.  A  breakdown  of  publications  in 
these  areas  is  given  in  Romero  [37].  In  total  he  lists  355  papers  dealing  with 
GP  applications  in  26  distinct  areas. 

Research  has  been  done  to  apply  other  Multi-Criteria  and  Management  Sci¬ 
ence  techniques  to  Goal  Programming.  These  include  interactive  multi-criteria 
methods  [38],  ‘Delphi’  techniques  [39,  40],  Saaty’s  [41]  analytical  hierarchy  ap¬ 
proach  [36,  23,  39],  and  resource  planning  and  management  systems(RPMS) 
networks  [42].  Recently  papers  have  been  published  dealing  with  some  of  the 
perceived  ‘errors’  in  G.P  [37,  40,  43],  and  explaining  how  these  can  be  avoided 
by  the  correct  setting  of  weights,  goals,  priority  levels  etc. 

The  remainder  of  the  paper  will  be  divided  into  four  sections.  Section  2 
will  deal  with  lexicographic(pre-emptive)  GP,  section  3  with  weighted  GPfnon 
pre-emptive),  section  4  with  the  connection  between  utility  functions  and  GP. 
finally  section  5  will  draw  conclusions  as  to  the  current  direction  of  GP  and  the 
direction  of  the  authors'  future  research. 

2  Lexicographic  GP 

Of  the  355  papers  mentioned  by  Romero  [37],  226  use  the  concept  of  Lexico¬ 
graphic  GP(LGP),  which  requires  the  pre-emptive  ordering  of  priority  levels. 
The  standard  LGP  model  can  be  algebraically  represented  as: 

Lex  min  a  =  (</i(n,  p),  g2(n,  p), . ,ff*(n,p)) 

subject  to, 

/,(x)  +  n,  -  pi  =  bi  i  =  1, . ,  m 

This  model  has  K  priority  levels,  and  m  objectives,  a  is  an  ordered  vector  of 
these  K  priority  levels. 

A  standard  ‘g’  (within  priority  level)  function  is  given  by: 

9k( n,  p)  =  ot.n,  4- . -f  a *mnm  +  0k,pl  + . +  0kmpm 

This  paper  will  summarize  the  development  of  algorithms  to  solve  the  LGP 
model,  work  on  the  multi-dimensional  dual  [30,  44],  and  current  thinking  on 
methods  of  priority  ranking  and  weighting  within  the  priority  levels,  borne 
applications  of  LGP  will  be  commented  on,  in  an  effort  to  outline  which  types 
of  problem  are  suitable  for  an  LGP  approach,  and  which  are  better  solved  using 
other  techniques. 

3  Weighted  GP 

Weighted  (or  non-pre-emptive)  GP(WGP)  requires  no  pre-emptive  ordering  of 
the  objective  functions.  Instead  all  the  different  deviations  are  placed  in  a  single 
priority  level  objective  with  different  weights  to  represent  their  importance. 
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Algebraically,  a  WGP  has  the  following  structure: 

A/ in  a  =  ]P(orjn,  +  0,p.) 

1  =  1 

Subject  to, 

/.(x)  +  n.  ~P«  =  i'=l, . m 

x  e  c, 

Where  C,  is  an  optional  constraint  set.  Of  interest  here  are  the  problems 
caused  by  incommensurability,  i.e.  objective  functions  being  measured  in  differ¬ 
ent  units,  and  techniques  used  to  overcome  this.  As  in  the  LGP  case,  application 
areas  will  be  outlined. 


4  Utility  Functions 

The  third  section  will  deal  with  the  connections  between  utility  functions  and 
the  different  types  of  GP.  It  will  explore  the  literature  on  the  problems  caused  in 
reconciling  LGP  and  utility  function  theory.  It  will  also  examine  recently  devel¬ 
oped  techniques  to  model  GP’s  more  closely  around  their  underlying  objective 
functions  [45], 

5  Summary  and  Conclusions 

The  final  section  will  draw  conculsions  as  to  the  scope  and  limitations  of  GP 
and  highlight  areas  in  which  the  authors  intend  to  conduct  further  research. 
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The  Economic  Lot  Scheduling  Problem  (EISP)  Is  to  economically 
schedule  lots  of  one  or  more  products  on  a  single  machine. 
Demand  Is  constant,  backlogging  Is  not  oiiowed  and  the  planning 
horizon  is  infinite.  The  problem  is  to  minimize  total  operating  cost  per 
unit  time  which  is  comprised  of  setup  costs  and  Inventory  costs. 
Setup  costs  are  incurred  whenever  a  production  for  a  lot  Is  begun 
and  inventory  carrying  costs  con  be  defined  as  the  time  value  of 
money  tied  up  In  inventory. 

An  extension  to  single  machlne/faciiity  problem  is  ihe  STudy  of 
environments  where  products  are  manufactured  through  several 
operations.  Such  systems  are,  in  general,  called  as  multi-stage 
production  systems.  Multi-stage  production  systems  received  a  lot  of 
academic  attention  in  recent  years  focussing  on  the  control  of  work- 
in-process  Inventory  and  Its  functional  relationship  to  the 
manufacturing  cycle  time.  It  Is  a  very  well  known  fact  by  now,  the 
larger  the  production  lot  size,  the  longer  the  manufacturing  cycle, 
which  in  turn.  Increases  the  work-lrvprocess  inventory.  There  exists  a 
vast  literature  modelling  this  relationship  to  varying  degrees  in 
different  models  for  different  system  configurations. 
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The  Multi-stage  Economic  Lot  Scheduling  Problem  (MS-ELSP)  brings 
together  two  important  problem  characteristics  inherent  to  muttMtem 
and  multi-stage  problems,  in  a  multi-item  problem,  the  mam  Issue  Is 
that  of  creating  schedules  which  avoids  the  interference  that  is  ilkeiy 
to  occur  when  two  or  more  products  compete  for  the  some  facility. 
We  will  refer  to  this  as  the  ‘feasibility  Issue*,  in  a  multi-stage 
environment,  the  production  should  be  synchronized  so  thaT 
concurrent  production  of  the  same  iot  Is  not  possible  In  the 
consecutive  stages.  This  characteristic  leads  to  the  definition  ot  work 
in-process  inventory  which.  In  fact.  Is  a  tool  for  the  synchronization  or 
production  among  stages.  Thus,  in  multi-stage  problems,  creating 
schedules  owing  this  property  will  be  referred  to  as  ‘consistency 
Issue".  This  study  addresses  the  Multi-stage  Economic  Lot  Scheduling 
Problem  with  the  objective  of  determining  feasible  and  consistent 
schedules  which  result  from  the  conventional  tradeoff  between  setup 
costs  and  Inventory  holding  costs  comprising  the  total  cost  of  c 
schedule. 

In  this  research,  we  restrict  the  study  of  MS-ELSP  to  serial  systems 
where  there  are  m  products  to  be  manufactured  through  n  distinct 
stages.  We  first  analyze  the  two  product  -  two  stage  problem.  In 
order  to  guarantee  feasibility,  common  cycle  solutions  In  which  the 
possible  values  of  cycle  times  for  all  Items  are  constrained  to  a  single 
cycle  time  value.  T.  are  sought  for.  In  a  two-stage  production 
system,  production  of  a  lot  on  the  second  stage  cannot  begin  untii 
its  production  on  the  first  stage  is  completed.  Therefore,  production 
between  stages  should  be  synchronized  so  that  we  end  up  with 
consistent  schedules.  To  ensure  consistency,  we  define  a  constraint 
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for  eoch  product  which  also  provide  information  about  the  work-in- 
process  Inventories. 

Another  important  point  In  this  study  Is  the  presence  of  nonrvegcrtk/e 
setup  times.  Setup  times  mean  a  loss  m  the  productive  capacity  and 
their  effect  on  lot  sizes  is  the  most  significant  when  the  capacity 
utilization  Is  high.  On  the  other  hand,  work-in-process  inventories 
tend  to  Increase  with  Increasing  capacity  utilizations.  Therefore. 
Ignorance  of  setup  times  will  resutt  In  overestimated  lot  sizes  due  to 
underestimation  of  work-in- process  Inventories. 

The  mathematical  programming  formulation  of  this  problem  is 
developed  where  the  objective  function  Is  nonlinear  with  a  linear  set 
of  constraints.  Setting  the  cycle  time  to  a  fixed  value,  we  first 
linearize  the  objective  function.  By  using  the  dual  problem  and 
complementary  slackness,  the  optimal  solution  of  this  problem  and 
thus  the  optimal  cycle  time  for  the  two  product  -  two  stage  problem 
are  obtained.  Besides,  we  have  the  exact  terms  for  the  work-in¬ 
process  Inventories  (queueing  Inventories:  inventory  that  built  up  on 
the  previous  stage  if  the  successor  stage  Is  busy  with  processing  the 
other  products)  since  they  can  be  expressed  explicitly  as  analytical 
functions  of  the  cycle  time.  Then,  we  generalize  our  resutt  to  multi¬ 
product  case  in  a  two  stage  system  which  constitutes  a  basis  for  the 
analysis  of  the  m-product.  n-stage  economic  lot  scheduling 
problem. 


Key  words:  Economic  Lot  Scheduling  Problem,  mutft-stoge 
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t'us  paper  we  present  computational  experience  with  a  primal-dual  interior  point  for  smooth  convex 
programming  problems  of  the  type 

min  cT~ 

s.t.  (1) 

g[x)  <  0, 

where  c  £  Rn  and  g  :  11°  — •  £lm  Is  a  vector-valued  function.  We  assume  that  each  component  g,  is 
convex  Let  s  t  F.m,  be  the  vector  of  slack  variables.  The  inequality  constraints  in  (1)  are  replaced  by 

g(x)  i  )=0,  s  >  0 

Given  a  parameter  a  >  0,  we  associate  with  (1)  the  barrier  proolem 

m 

rrun  <:Tx  -jiT  In  s, 

•  ~i 

s.t.  (2) 

jyz)  s  =  u 
s  >  0. 

vVe  iSoume  that  Slater's  condition  bolds: 


."assumption  G.I  There,  i janit  R11  rue/ 1  that  3(1)  <  0. 

V/e  also  arsumo 

Assumption  0.2  The  set  {z  :  g-,x)  <  0  and  crx  <  J)  ts  hounded  for  cxi  6. 

Under  these  assumptions  Problem  (2)  has  a  solution.  The  necessary  and  sufficient  conditions  for  opti¬ 
mality,  namely  the  Karush-Kuhn-Tueker  equations,  or  KKT  equations,  are 


with  s  >  0  and  y  >  0.  Here 

is  the  Jacobian  matrix  of  g  and  y  6  ftm  is  a  vector  of  dual  variables. 
Let 

F  :  Rm  x  Rm  x  R"  —  Rm  x  Rm  x  R" 
be  a  multi-valued  function  defined  by 


V  3  —  fie  = 

0 

(J) 

six)  +  >  - 

0 

(4) 

(M)r>+‘  - 

0, 

(5) 

dg  f  %(x) ' 

dx  1  dxj 

1 

F'\ 

F(x)  =  \Fr  = 
\fJ 


'Yt-tic' 

BL 

*7 

BL 


with  z  —  (y,  s,  x).  F  also  depends  on  the  parameter  p  >  0.  With  this  notation,  the  KKT  system  is  simply 

F{c)  =0. 

We  also  introduce  the  Lagrangean 


£(y;x,  t)  =  cTx  +  yT(y(x)  +  *). 


(6) 


The  KKT  system  (3)  -  (5)  can  be  rewritten  as 
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The  algorithm  goes  as  follows:  Given  an  interior,  but  not  necessarily  feasible,  point,  we  compute  the 
search  direction  dz  associated  with  n.  Then  a  step  is  taken  along  that  direction  such  that  the  intenor 
property  is  maintained.  Namely,  let  <5  :=  max{o  :  y  +  ody  >  0,  a  +  ads  >  0}  and  let  0  <  -y  <  1 .  Then 
the  next  iterate  is  given  by 

x  :=  x  +  yadx 
»  :=  #  +  -yadi 
V  :=  V  +  7®dy, 

Te  choice  of  is  adaptive.  For  “normal"  steps,  we  take  n  =  If  miny.s,  <  7*^,  the  vector  V'j  is 
considered  excessively  unbalanced  and  we  take  n  =  This  step  is  named  “centering”. 

We  tested  our  algorithm  on  a  sample  of  medium  size  random  problems.  We  primarily  studied  the  effect 
of  varying  the  size  of  the  problems.  We  observed  that  the  number  of  iterations  increases  slowly  with  the 
number  of  constraints  and,  surprisingly  enough,  it  decreases  with  the  number  of  free  variables  in  the  case 
of  quadraticaliy  constrained  problems. 

We  analyzed  the  influence  of  centering  and  showed  it  to  be  positive.  We  also  studied  alternative  strategies 
for  the  step  size.  It  turns  out  that  taking  a  fixed  fraction  of  the  maximal  step  size  works  well  in  practice. 
Moreover  the  fraction  can  be  extremely  dose  to  1  without  any  negative  effect  on  the  performance  of  the 
method.  Finally,  we  looked  at  different  choices  for  the  starting  point. 

We  applied  this  algorithm  to  linear  programming  problems.  The  algorithm  behaves  a  bit  differently  than 
with  quadratic  constraints.  The  iteration  count  increases  both  with  the  number  of  constraints  and  the 
number  of  free  variables.  For  the  former  the  increase  is  slower.  The  figures  are  reasonable. 
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1  INTRODUCTION 

Mathematical  programming  and  theory  of  scheduling  have  a  lot  of  optimiza¬ 
tion  problems  which  are  NP-hard  in  spite  of  their  very  simple  structure.  Thus  these 
problems  are  considered  to  be  difficult  to  solve.  But  some  of  them  are  easy  in  the 
sense  that  there  are  straightforward  ways  to  generate  feasible  solutions  of  them,  e.g. 
the  knapsack  problem,  the  TSP  problem  and  many  scheduling  problems. 

One  of  them  is  the  scheduling  of  identical  parallel  machines  where  the  max¬ 
imal  completion  time  has  to  be  minimized.  This  problem  is  the  topic  of  this  ex¬ 
perimental  study.  It  has  several  heuristics.  The  two  basic  ones  are  Graham’s  list 
scheduling  and  the  multi-fit  algorithm.  There  are  known  upper  bounds  for  the 
relative  accuracy  of  the  heuristic  solutions  provided  by  these  methods.  The  two 
algorithms  have  quite  different  strategies.  This  is  the  reason  that  some  problems 
worst  from  the  point  of  view  of  list  scheduling  can  be  solved  exactly  by  the  multi-fit 
algorithm  and  vice  versa.  This  gives  the  question  that  how  bad  accuracy  can  have 
the  better  of  the  list  scheduling  and  the  multi-fit  solutions.  This  was  the  initial  ques¬ 
tion  of  this  research.  Another  algorithm  called  intercl  .nging  method  has  been  also 
investigated.  The  research  made  necessary  to  sharpen  the  well-known  lower  bound 
of  the  optimal  value  of  the  ob  jective  function,  too. 

2  THE  SCHEDULING  PROBLEM 

In  the  classical  problem  of  the  scheduling  of  parallel  machines  n  jobs  have 
to  be  distributed  among  m  identical  machines  in  such  a  way  that  the  makespan  is 
minimal. 

The  whole  operation  starts  at  time  0.  The  machine  independent  processing 
times  are  denoted  by  pj(j  =  l,...,n)  which  are  positive  integers.  It  is  easy  to  see 
that  there  is  at  least  one  optimal  solution  such  that  the  machines  start  to  work  at 
t=0  and  are  working  without  any  idle  time  until  all  jobs  assigned  to  them  have  been 
finished. 

Let  Cj  be  the  completion  time  of  job  j.  The  maximal  completion  time,  i.e. 
max{C;  :  j  =  l,...,n},  is  denoted  by  C". 

Theorem  1  [Graham  69],  [Coffman  et  at.  78}  In  any  problem 

pj,ma.x{p}  :j  =  l,...,n}} 

<  C*  <  (1) 

maxI  m  £j=i  Pi»  niax{j»j  :j  =  1,-. ,«}}.□ 
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The  interval  in  which  the  optimal  value  must  lie  is  denoted  by  [L,U\,  i.e. 

1  ” 

L  -  fmax{—  Y'p},max{pJ  (2) 

and 

2  11 

0  -  [max{—  :  j  =  t3) 

111 

J~l 

Both  the  list  scheduling  and  the  multi-fit  algorithm  start  with  the  deter¬ 
mination  of  the  nonincreasing  order  of  the  processing  times.  The  two  algorithms 
assign  the  jobs  to  machines  in  that  order.  Therefore  without  loss  of  generality  it 
may  be  assumed  that 


Pi  >  Pi  >  >  Pn 


(4) 


The  rule  of  the  list  scheduling  is  that 

every  job  is  assigned  to  a  machine  having  minimal  current  load.  (£5) 


Theorem  2  [Graham  69 J  LctC(LS)  be  the  value  of  the  solution  provided  by  the  list 
scheduling.  Then 


C{LS)  4 
C’  "  3 


(5) 


Theorem  3  [ Graham  69]  If  there  is  an  optimal  solution  which  assigns  to  each  ma¬ 
chine  at  most  2  jobs,  thru  the  solution  given  by  the  list  scheduling  is  optimal.  □ 


The  multi-fit  algorithm  consists  of  two  parts.  A  greedy  method  is  the  in¬ 
ternal  part  and  a  logarithmic  scan  h  is  the  external  part  which  organizes  the  ap¬ 
plications  of  the  greedy  method.  I'or  the  internal  part  an  upper  bound  K  of  the 
optimal  value  is  assumed.  The  greedy  method  assigns  each  job  to  the  first  machine 
into  it  fits  not  exceeding  the  upper  bound  l\ .  In  the  external  part  a  current  lower 
bound  and  a  current  upper  bound  are  assumed  and  are  denoted  by  Ic  and  uc.  For 
the  internal  part  K  is  chosen  as  If  the  greedy  method  was  able  to  find  a 

solution  not  worst  then  K,  then  uc  becomes  [/("J,  otherwise  Ic  =  [  IC\ .  The  process 
is  repeated  until  the  condition 

uc  —  Ic 

is  not  satisfied.  Notice  that  it  follows  from  the  assumption  of  the  integrality  of  the 
processing  times  that  the  number  of  applications  of  the  greedy  method  is  0(log(U  - 
£)).  Thus  the  multi-fit  algorithm  is  polynomial. 


Theorem  4  / Friesen  84]  Let  C(M  F)  be  the.  value  of  the  solution  provided  by  the 
multi- fit  algorithm.  Then 


C(MF) 

('■ 


<  1.2  □ 


(6) 
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A  third  heuristic  method  called  interchanging  algorithm  has  been  applied 
in  this  research.  It  makes  the  following  steps  starting  from  any  solution.  It  inter¬ 
changes  one  job  of  the  most  loaded  machine  with  one  job  of  another  machine.  The 
interchange  is  possible  if  and  only  if  the  maximal  completion  time  is  decreased  in 
this  way.  Let  s  and  t,  resp.,  be  indices  of  the  most  loaded  and  the  other  machines, 
resp.  The  current  load  of  the  machines  are  denoted  by  L,  and  Lt.  Suppose  that  the 
job  i  of  machine  s  is  interchanged  with  job  j.  Then  the  following  two  conditions 
must  hold 


Pr  >  P]  (?) 

and 

+  p,  -  p j  <  Ls.  (8) 

O 

In  the  current  version  if  a  possible  interchange  is  found  then  it  has  been  executed. 
The  order  of  checking  Conditions  (7)  and  (8)  is  as  follows.  The  jobs  of  the  most 
loaded  machine  are  compared  with  the  jobs  of  another  machine  faking  the  other 
machines  in  an  increasing  load  order.  The  jobs  of  the  two  machines  are  taken  in  a 
decreasing  processing  time  order.  One  j<  of  the  most  loaded  machine  is  compared 
with  all  of  the  jobs  of  the  other  machine.  If  no  possible  interchange  is  found  then  the 
next  job  of  the  most  loaded  machine  is  taken.  The  number  of  comparisons  of  one 
iteration  are  0(n2).  To  get  a  polynomial  algorithm  the  number  of  interchanges  has 
been  limited  by  m  +  2.  In  the  current  version  the  solution  provided  by  list  schedul¬ 
ing  is  the  starting  point.  This  algorithm  is  one  of  simplest  possible  interchanging 
methods.  In  more  general  a  subset  of  jobs  ran  be  interchanged  for  another  subset  of 
jobs.  In  that  case  the  complexity  of  the  select  ion  of  the  two  subsets  is  much  higher. 

3  IMPROVEMENTS  OF  THE  LOWER  BOUND 


The  randomly  generated  problems  have  not  been  solved  with  any  kind  of 
enumerative  methods.  One  easy  way  to  prove  the  optimality  of  a  solution  is  that 
the  value  of  it  and  the  lower  bound  coincide.  Therefore  it  was  important  to  find 
some  ways  to  improve  the  lower  bound. 

In  (2)  only  two  information  are  taken  into  consideration,  the  average  load 
and  the  maximal  processing  time.  The  following  two  sharpening  of  the  lower  bound 
are  based  on  the  fact  that  what  is  the  number  of  jobs  which  must  be  assigned  to 
certain  machines. 


Theorem  5  Assume  that  (4)  holds.  Then 

Cm  >  Pn-\%]+i  +  +  Pn  □ 

Theorem  6  Assume  that  (4)  holds.  Let 


r 


n  — 


0) 
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.  /;  r  >  1 

C*  >  £  p\  a  (10) 

i  r  j=— f£l  J 

In  some  cases  there  are  jobs  which  are  not  effecting  C‘,  because  their  pro¬ 
cessing  times  are  relatively  very  small.  In  that  cases  the  following  observation  is 
useful. 


Theorem  7  Lei  S  be  any  subset  of  the  jobs.  Let  6c  any  lower  bound  for  the 
problem  defined  by  the  jobs  those  m  S.  Then  Ls  is  a  lower  bound  for  the  original 
problem.  □ 


Theorem  8  Let  k  be  any  index  with  1  <  k  <  n.  Assume  that:  (i)  the  list  scheduling 
has  assigned  until  that  point  exactly  k  jobs  to  machines ,  (ii)  none  of  the  machines 
has  more  than  two  jobs,  (hi)  pk-i  +  Pk - 1  +  Pk  ts'  at  least  as  great  as  the  current  load 
of  any  machine.  Then  the  current  maximal  load  is  a  lower  bound  for  the  optimal 
value  of  the  problem.  □ 


Theorem  9  Let  k  be  a  fixed  indtx  and 


t,  = 


lh 

LPfcJ 


Then 


f  Z%,‘, 
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J  =  1 . n. 


Pk  <  C\ 


A  COMPrTATIONAL  EXPERIENCES 


The  computational  experiences  have  been  made  in  three  phases.  In  the  first 
phase  about  S00.000  problems  belonging  to  different  c  lasses  have  been  generated, 
lit  this  phase  some  observations  have  been  made  which  modified  the  objectives  of 
the  research.  The  second  phase  was  the  main  one  in  which  1.200.000  problems 
have  been  generated  in  a  wide  range  of  problem  classes  to  find  difficult  problems. 
Further  attempts  have  been  made  to  find  more  difficult  problems  in  the  most  hopeful 
problem  classes. 


Definition  1  LciC(LS)  and  C(MF)  and  C(IC)  and  C‘  be,  resp.,  the  value  of  the 
solution  provided  by  the  list  scheduling  and  the  multi- fit  algorithm  and  the  inter¬ 
changing  method  and  of  the  optimal  solution,  resp.  A  particular  problem  is  called 
first  order  difficult  if  the  value 


min  {C(LS),C{MF)) 

C* 

is  high.  It  is  called  second  order  difficult  if  the  value 


mi  n{C(LS),C(IC),C(MF)}  min  jC(/C).C{MF)) 

C'  ~  C* 


(11) 


(12) 


is  high. 
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This  definition  is  not  correct  in  a  strict  mathematical  sense,  because  the 
meaning  of  the  word  "high"  is  undefined.  This  meaning  has  been  determined  during 
the  experiences. 

A  problem  class  is  determined  by  the  following  parameters:  m  -  the  number 
of  machines,  n  -  the  number  of  jobs,  p  -  the  maximal  possible  processing  time;  the 
processing  times  are  generated  randomly  by  the  fl,p]  integer  uniform  distribution. 

In  the  experiences  the  following  formulas  have  been  used  instead  of  (11) 

and  (12) 

min  {C(LS),C(MF)} 

z 

and 

min  {C(IC),C(MF)) 

- ; -  (14) 

L 

where  L  is  some  lower  bound  of  the  optimal  value  of  the  objective  function. 

4.1  Observations  of  the  First  Phase 

In  the  first  phase  only  the  list  scheduling  and  the  multi-fit  algorithm  have 
been  used. 

At  the  beginning  of  the  experiences  L  has  been  chosen  as  L.  Some  problems 
seemed  to  be  difficult  although  an  optimal  solution  has  been  obtained  by  one  of  the 
methods.  In  some  cases  this  fact  could  be  proven  by  one  of  the  improvements  of  the 
lower  bound  discussed  in  Section  3. 

Some  problems  had  just  the  opposite  behaviour.  Here  the  lower  bound 
coincided  with  the  optimal  value.  In  many  r.xsrs  this  fact  could  be  proven  bv  the 
interchanging  algorithm.  This  is  the  reason  that  this  method  had  to  be  involved 
into  the  investigations. 

Among  the  most  difficult  problems  found  in  this  phase  there  were  many 
such  that  the  smallest  processing  time  was  relatively  great.  Therefore  in  the  second 
phase  of  the  experiences  the  generation  of  the  the  problems  has  been  modified  as 
follows.  The  first  thousand  problems  has  been  generated  as  earlier.  In  the  case  of 
the  problems  of  the  second  thousand  the  processing  times  were  increased  by  1,  in 
the  case  of  the  third  thousand  by  2,  e.t.c.  This  cannot  be  applied  for  all  of  the 
classes,  because  in  some  cases  if  the  increase  is  not  less  than  a  certain  value,  the 
problem  regardless  the  generated  random  numbers  becomes  trivial. 

The  problems  which  seemed  to  be  difficult  were  belonging  to  two  different 
categories.  The  first  one  is  the  set  of  first  order  difficult  problems.  The  most  difficult 
problem  in  this  sense  was  the  following,  n  —  10,  m  —  3  and  the  processing  times  are 
30,  29,  24,  18,  17,  17,  17,  14,  13,  13.  The  solution  provided  by  the  list  scheduling 
is  as  follows:  Ml:  30,  17,  13;  M2:  29,  17,  14;  M3:  24,  18,  17,  13.  The  multi-fit 
solution  is:  Ml:  30,  29,  13;  M2:  24,  18,  17,  13;  M3:  17,  17,  14.  Both  of  them  have 
the  value  72.  But  the  optimal  solution  is  the*  following:  Ml:  30,  17,  17;  M2:  29,  IS. 
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17;  M3:  24,  14,  13,  13.  The  value  of  it  is  64.  Since  that  time  Definition  1  has  had 
the  meaning  that  a  problem  is  first  order  difficult  if 

min{C(  LS),C[M  F)}  9 

C-  >  8' 

The  computationally  difficult  problems  belong  to  the  second  category.  In 
the  case  of  such  a  problem  it  is  difficult  either  to  find  the  optimal  solution  or  to 
prove  the  optimality  of  a  solution  generated  by  one  of  the  heuristics. 

4.2  Experiences  of  the  Main  Phase 

In  the  second  phase  an  intensive  search  has  been  carried  out  for  difficult 
problems.  100.000  problems  have  been  generated  in  each  of  the  problem  classes. 
The  generated  solutions  are  within  105%  and  even  101%  of  the  improved  lower 
bound  in  the  case  of  a  very  great  part  of  the  problems  in  each  class.  These  results 
are  summarized  in  Table  1. 

it  turned  out  that  none  of  the  list  scheduling  and  the  multi-fit  algorithm  is 
superior  to  the  other  one.  This  is  indicated  by  the  numbers  of  problems  such  that 
the  appropriate  heuristic  solution  is  within  101%.  The  number  of  problem  classes 
for  which  a  method  is  superior  to  the  other  one  is  approximately  is  the  same  for 
both  algorithms.  The  behaviour  of  both  methods  are  very  different  in  the  different 
classes.  But  the  "the  better  of  list  scheduling  and  multi-fit”  seems  to  be  much  stable. 


j  ')/'»/!> 

101% 

LS-MF 

^  101% 
IC-MF 

.  105%“ 

LS-MF 

foi^r 

IC-MF 

1 0/3/ 1 5 

88067 

94179 

98817 

99897 

15/3/15 

95177 

99441 

99997 

99999 

85831 

97418 

99815 

15/3/30 

89662 

99002 

99999 

100000 

10/3/60 

45787 

69949 

94304 

iwmmrni 

80167 

98599 

99996 

100000 

30/3/15] 

99910 

100000 

100000 

99917 


99132 


98244 


89275 


100000 


960051 


S7.27 


100000 


98245 


97461 


100000 


1043707 


94.88 


100000 


100000 


99043 


99995 


100000 


1089569 


99.05 


100000 


100000 


99044 


100000 


100000 


1098337 


99.85 


Table  1:  The  numbers  of  problems  having  good  heuristic  solution 
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parameters 

MF 

LS 

10/3/15 

87795 

1318 

IHEKZfia 

7324 

93662 

73529 

950 

WKBffiEl 

12688 

86078 

WKEfflM 

44898 

1568 

23804 

72245 

1— fiff 

26577 

99725 

50626 

99591 

WKMBESl 

42090 

99132 

97997 

iii^pfpp 

3649 

87973 

100000 

99111 

Table  2:  Comparison  of  the  list  scheduling  and  multi-fit  heuristics 

The  most  first  order  difficult  problem  which  has  been  found  in  this  phase  L 
the  following,  n  =  10,  m  =  3  and  the  processing  times  are  15,  14,  12,  9,  8,  8,  8.  7. 
6,  6.  The  solution  provided  by  the  list  scheduling  is  as  follows:  Ml:  15.  8,  C,  6;  M2: 
14,  8,  7;  M3:  12,  9,  8.  The  multi-fit  solution  is:  Ml:  15,  14,  6.  M2:  12,  9,  8,  6;  M3. 
8,  8,  7.  Both  of  them  have  the  value  35.  But  the  optimal  solution  is  the  foilowine 
Ml:  15,  8,  8;  M2:  14,  9,  8;  M3:  12,  7,  6,  G.  The  value  of  it  is  31. 

4.3  Further  difficult  Problems 

The  aim  of  the  third  phase  has  been  io  find  further  difficult  problems.  Some- 
new  problem  classes  are  introduced,  because  it  is  likely  on  the  basis  of  the  previous 
experiences  that  these  classes  contain  the  desired  items.  At  the  end  of  this  phase 
the  number  of  the  generated  problems  have  exceeded  2.000.000. 

The  class  19/8/15  contained  the  known  most  difficult  problem.  The  pro¬ 
cessing  times  of  it  are:  21,  21,  20,  20,  19,  18,  17,  17,  16,  16,  16,  16.  12.  12,  12,  11, 
11,  10,  10.  The  multi-fit  solution  is:  Ml:  21,  21;  M2:  20,  20:  M3.  19.  18;  M4:  17. 
17;  M5:  16,  16,  10;  M6:  16,  16,  10;  M7:  12,  12,  12;  M8:  11,11.  The  value  of  it 
is  42,  which  is  achieved  at  Ml  and  M5  and  M6.  The  solution  provided  by  the  list 
scheduling  with  value  43  is  this:  Ml:  21,  11,  11;  M2:  21,  12;  M3:  20,  12,  10;  M4: 
20,  12,  10;  M5:  19,  16;  M6:  18,  16;  M7:  17,  16;  MS:  17.  16;  In  the  optima;  solution 
the  completion  time  is  37  on  all  of  the  machines  except  the  last  one  where  it  is  36: 
Ml:  21,  16;  M2:  21,  16;  M3:  20,  17;  M4:  20,  17;  M5:  19,  18;  M6:  16,  11,  10;  M7: 
16,  11,  10;  M8:  12,  12,  12. 

The  development  of  the  accuracy  of  the  most  known  first  order  difficult 
problems  has  been:  |  <  <  p-  The  value  42/37,  which  in  not  proven  to 

be  an  upper  bound,  is  less  than  the  value  72/6!  guaranteed  by  the  algorithm  of 
(Friesen-Langston  86],  which  uses  many  operations  from  a  practical  point  of  view. 

There  was  no  improvement  in  the  position  of  most  second  order  difficult 
problem  in  this  phase. 
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4.4  Some  Other  Observations 

Some  other  observations  are  obtained  from  the  experiences.  An  important 
one  is  the  following.  If  L  ^  U  then  U  is  far  from  the  optimal  value.  The  10/5/15 
ciass  is  the  only  one  where  the  ratio  ( 1 3)  had  a  value  greater  then  1 .22.  The  observed 
greatest  value  is  1.42. 

The  aim  of  the  improvements  of  the  lower  bound  was  to  decrease  the  number 
of  cases  to  check.  In  Tabie  5  the  the  number  of  problems  which  have  been  proved 
to  be  solved  within  101%,  and  tiie  observes  worst  (14)  ratio  observed  before  any 
improving  and  after  improving  (without  the  application  of  Theorem  9)  are  provided 
for  the  better  of  multi-fit  and  interchanging  procedure. 


parameters 

— 

■■Hj 

10/3/15 

8807S 

89871 

1.217 

1.120 

15/3/15 

"  99344 

99344 

1.030 

i  .030 

10/3/30 

65089 

68313 

1 .262 

1.102 

15/3/30 

99098 

99098 

1.032 

1,032 

10/3/60 

44046 

■rifjHEBl 

1.211 

1.100 

99045 

99045 

1.032 

1.032 

30/3/15  I 

100000 

100000 

1.005 

1.005 

mmsmm 

100000 

100000 

1.007 

1.007 

100000 

100000 

1.005 

1.005 

10/5/15 

37056 

88469 

1.412 

1.200 

20/5/15 

97240 

97240 

1.040 

1.040 

60/5/60 

100000 

100000 

100000 

100000 

Table  3:  The  effect  of  the  improvements  of  the  lower  bound 
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Abstract 

Tabu  Search  is  a  metastrategy  for  guiding  known  heuristics  to  overcome  local  op¬ 
timality.  Successful  applications  of  this  kind  of  metaheuristic  to  a  great  variety  of 
problems  have  been  reported  in  the  literature.  Recently  some  implementations  et 
tabu  search  on  parallel  computers  have  come  up.  Whereas  these  implementations 
are  tailored  to  specific  problems  we  attempt  to  provide  ideas  for  a  more  general 
concept  for  developing  parallel  tabu  search  algorithms. 


1  Introduction 

Due  to  the  complexity  of  a  great  variety  of  combinatorial  optimization  problems,  heuristic 
algorithms  are  especially  relevant  for  dealing  with  large  scale  problems.  The  main  draw 
back  of  algorithms  sucli  as  deterministic  exchange  procedures  is  their  inability  to  continue 
the  search  upon  becoming  trapped  in  a  local  optimum.  This  suggests  consideration  o" 
recent  techniques  for  guiding  known  heuristics  to  overcome  iocai  optimality.  Following 
this  theme,  the  application  of  the  tabu  search  metastrategy  for  solving  combinatorial 
optimization  problems  is  investigated. 

The  key  issue  in  designing  parallel  algorithms  is  to  decompose  the  execution  oi  the 
various  ingredients  of  a  procedure  into  processes  executable  by  parallel  processors.  Impro¬ 
vement  procedures  like  tabu  search  or  simulated  annealing  at  first  glance,  however,  have 
an  intrinsic  sequential  nature  due  to  the  idea  of  performing  the  neighbourhood  searcn 
from  one  solution  to  the  next.  Therefore,  there  is  not  yet  a  common  or  generally  applica¬ 
ble  parallelization  of  tabu  search  in  the  literature.  In  the  sequel  we  attempt  to  describe 
some  general  ideas  and  a  classification  scheme  for  parallel  tabu  search  algorithms. 

In  Section  2,  we  present  an  outline  of  tabu  search.  Before  describing  some  concepts 
for  parallel  tabu  search  algorithms  in  more  detail  (see  Section  4),  we  briefly  discuss  some 
of  the  common  parallel  machine  models  and  algorithms  ill  Section  3.  Some  examples 
are  given  in  Section  5  and  finally  some  conclusions  are  drawn  (Section  6).  The  attempt, 
of  course,  is  not  to  give  a  complete  treatment  of  parallel  tabu  search  but  to  sketch  the 
potential  this  area  of  research  carries.  For  a  more  detailed  treatment  of  the  ideas  of  this 
paper  see  Vo($  (l!)f)‘2). 


2  Tabu  Search 


Many  solution  approaches  are  characterized  by  identifying  a  neighbourhood  of  a  given 
solution  which  contains  other  ( transformed)  solutions  that  can  be  reached  in  a  single 
iteration.  A  transition  from  a  feasible  solution  to  a  transformed  feasible  solution  is  referred 
to  as  a  move  and  may  be  described  by  a  set  of  one  or  more  attributes.  In  a  zero-one 
integer  programming  context,  e  g.,  these  attributes  may  be  the  set  of  all  possible  value 
assignments  or  changes  in  such  assignments  for  the  binary  variables.  (Then  two  attributes 
denoting  that  a  certain  binary  variable  is  set  to  1  or  0,  may  be  called  complementary  to 
each  other.)  Following  a  steepest  descent /mildest  ascent  approach,  a  move  may  either 
result  in  a  best  possible  improvement  or  a  least  deterioration  of  the  objective  function 
value.  Without  additional  control,  however,  such  a  process  can  cause  a  locally  optimal 
solution  to  be  re-visited  immediately  after  moving  to  a  neighbour. 

To  prevent  the  search  from  endlessly  cycling  between  the  same  solutions,  tabu  search 
may  be  visualized  as  follows.  Imagine  that  the  attributes  of  ail  moves  are  stored  in  a  run- 
titng  list ,  representing  the  trajectory  of  solutions  encountered.  Then,  related  to  a  sublist 
of  the  running  list  a  so-called  tabu  list  may  be  defined.  Based  on  certain  restrictions,  it 
keeps  some  moves,  consisting  of  attributes  complementary  to  those  of  the  running  list, 
which  will  be  forbidden  in  at  least  one  subsequent  iteration  because  they  might  lead  back 
to  a  previously  visited  solution.  Thus,  the  tabu  list  restricts  the  search  to  a  subset  of  ad¬ 
missible  moves  (consisting  of  admissible  attributes  or  combinations  of  attributes).  This 
hopefully  leads  to  ’good’  moves  in  each  iteration  without  re  visiting  solutions  already 
encountered.  A  general  outline  of  a  tabu  search  procedure  (for  solving  a  minimization 
problem)  may  be  described  as  follows: 

Tabu  Search 

Given:  A  feasible  solution  x*  with  objective  function  value  z*. 

Start:  Let  x  :=  x*  with  z(x)  =  z‘. 

Iteration: 

while  stopping  criterion  is  not  fulfilled'do  begin 

(1)  select  best  admissible  move  that  transforms  x  into  x'  with  objective  func¬ 
tion  value  z(x')  and  add  its  attributes  to  the  running  list 

(2)  perform  tabu  list  management:  compute  moves  to  be  set  tabu,  i.e.,  update 
the  tabu  list 

(3)  perform  exchanges:  x  :=  x',z(x)  =  z(x') 

if  z(x)  <  z*  then  z*  :=  z(z),x+  :=  x  endif 

endwhile 

Result:  x*  is  the  best  of  all  determined  solutions,  with  objective  function  value  z\ 

*  *  * 

For  a  background  on  tabu  search  and  a  number  of  references  on  successful  applications 
of  this  mctahcuristic  see,  e.g.,  Glover  (1989,  1990),  Glover  and  Laguna  (1992),  and  VoB 
(1992). 

‘A  possible  stopping  criterion  can  be,  e.g.,  a  prespecified  time  limit. 
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Tabu  List  Management 

Tabu  list  management  concerns  updating  the  tabu  list,  i.c.,  deciding  on  how  many  and 
which  moves  have  to  be  set  tabu  within  any  iteration  of  the  search.  Up  to  now,  the  most 
popular  approach  in  literature  is  to  apply  static  methods  like  the  labu  navigation  method 

(TNM). 

In  TNM,  single  attributes  are  set  tabu  as  soon  as  their  complements  have  been  pan 
of  a  selected  move.  The  attributes  stay  tabu  for  a  distinct  time,  i.e.  number  of  iterations, 
until  the  probability  of  causing  a  solution’s  re-visit  is  small.  The  eificiency  of  the  algorithm 
depends  on  the  choice  of  the  tabu  status  duration,  i.e.  the  length  tl^ize  of  the  labu  list. 
(In  the  literature  often  a  ’magic’  tl_size=7  is  proposed.)  For  the  sake  of  an  improved 
elfectivity,  a  so-called  aspiration  level  criterion  is  considered,  which  permits  the  choice  of 
an  attribute  even  when  it  is  tabu.  This  can  be  advantageous  when  a  new  best  solution 
may  be  calculated,  or  when  the  tabu  status  of  the  attributes  prevent  any  move  from 
feasibility. 

The  static  approach,  though  successful  in  a  great  number  of  applications,  seems  to 
be  a  rather  limited  one.  Another  probably  more  fruitful  idea  is  to  define  an  attribute 
as  being  potentially  labu  if  it  belongs  to  a  chosen  move  and  to  handle  it  in  a  candidate 
list  first.  Via  additional  criteria  these  attributes  can  be  definitely  included  in  the  tabu 
list  if  necessary,  or  excluded  from  the  candidate  list  if  possible.  Therefore,  the  candidate 
list  is  an  intermediate  list  between  a  running  list  and  a  tabu  list.  Glover  (11*90}  suggests 
the  use  of  different  candidate  list  strategies  in  order  to  avoid  extensive  computational 
effort  without  sacrificing  solution  quality.  In  the  sequel,  we  sketch  the  following  dynamic 
strategies  for  managing  tabu  lists:  the  cancellation  sequence  method  (CSM,  in  a  revised 
version,  cf.  Dammcyer  et  al.  (1991)),  and  the  reverse  elimination  method  (REM). 

CSM  as  well  as  REM  both  use  additional  criteria  for  setting  attributes  tabu.  The 
primary  goal  is  to  permit  the  reversion  of  any  attribute  but  one  between  two  solutions 
to  prevent  from  re-visiting  the  older  one.  To  find  those  ci'ilic.al  moves,  CSM  needs  a 
candidate  list  that  contains  the  complements  of  attributes  being  potentially  tabu.  This 
active  tabu  list  (ATL)  is  built  like  the  running  list  where  elimination  of  certain  attributes 
is  furthermore  permitted.  Whenever  an  attribute  of  the  last  performed  move  finds  its 
complement  on  ATL  this  complement  will  be  eliminated  from  ATL.  All  attributes  bet 
ween  the  cancelled  one  and  its  recently  added  complement  build  a  cancellation  sequence 
separating  the  actual  solution  from  the  solution  that  has  been  left  by  the  move  that  con 
tains  the  cancelled  attribute.  Any  attribute  but  one  of  a  cancellation  sequence  is  allowed 
to  be  cancelled  by  future  moves.  This  condition  is  sufficient  but  not  necessary,  as  some 
additional  aspects  have  to  be  taken  into  account  so  that  CSM  works  well. 

The  method  works  well  for  the  case  that  a  move  consists  of  exactly  one  attribute,  i.e., 
when  so-called  single-attribute  moves  are  considered  instead  of  multi- attribute  moves.  In 
addition,  the  corresponding  parameters  have  to  be  chosen  appropriately  (e.g.  the  tabu 
list  duration  of  a  tabu  attribute,  and  how  to  apply  the  aspiration  level  criterion).  Ap¬ 
plying  CSM  to  multi-attribute  moves  needs  additional  criteria  to  prevent  errors  caused 
by  uncovered  special  cases.  E.g.  for  paired-attribute  moves  (moves  consisting  of  exactly 
two  attributes)  those  moves  must  be  prohibited  that  may  cancel  a  cancellation  sequence 
consisting  of  exactly  two  attributes  (because  none  of  them  is  tabu  when  choosing  a  move). 
In  addition,  for  building  a  cancellation  sequence,  the  remaining  attributes  of  the  older 
and  the  current  move  are  not  necessarily  taken  into  consideration.  This  depends  on  the 
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order  in  which  the  move's  attributes  are  added  to  ATL. 

The  conditions  of  TNM  and  CSM  need  not  be  necessary  to  prevent  from  re-visiting 
previously  encountered  solutions.  Necessity,  however,  can  be  achieved  by  REM.  The  idea 
of  REM  is  that  any  solution  can  only  be  rc-visitcd  in  the  next  iteration  if  it  is  a  neighbour 
of  the  current  solution.  Therefore,  in  each  iteration  the  running  list  will  be  traced  back 
to  determine  all  moves  which  have  to  be  set  tabu  (since  they  would  lead  to  an  already 
explored  solution).  For  this  purpose,  a  residual  cancellation  sequence  (RCS)  is  built  up 
stepwise  by  tracing  back  the  running  list.  In  each  step  exactly  one  attribute  is  processed, 
from  last  to  first.  After  initializing  an  empty  RCS,  only  those  attributes  are  added  whose 
complements  are  not  in  die  sequence.  Otherwise  their  complements  in  the  RCS  are 
eliminated  (i.e.  cancelled).  Then  at  each  tracing  step  it  is  known  which  attributes  have 
to  be  reversed  in  order  to  turn  the  current  solution  back  into  one  examined  at  an  earlier 
iteration  of  the  search.  If  the  remaining  attributes  in  tiie  RCS  can  be  reversed  by  exactly 
one  move  then  this  move  is  tabu  in  the  next  iteration.  For  single-attribute  moves,  for 
instance,  the  length  of  an  I1CS  must  be  one  to  enforce  a  tabu  move.  Correspondingly,  in 
a  slightly  modified  method  REM2  all  common  neighbours  of  the  current  solution  and  of 
an  already  explored  one  will  be  forbidden.  These  neighbours  were  implicitly  investigated 
during  a  former  step  of  the  procedure  (due  to  the  choice  of  a  best  rion-tabu  neighbour) 
and  need  not  be  looked  at  again  (cf.  VoB  (1992)). 

Obviously,  the  execution  of  REM  and  of  REM2  represents  a  necessary  and  sufficient 
criterion  tc  prevent  from  re-visiting  known  solutions.  Since  the  computational  effort  of 
REM  increases  if  the  number  of  iterations  increases,  ideas  for  reducing  the  number  of 
computations  have  been  developed  (cf.  Glover  (1990)  and  Dammeyer  and  VoB  (1991a)) 

For  applications  and  (sequential)  comparisons  of  TNM,  CSM,  and  REM  see  Dammeyer 
and  VoB  (1991b)  and  Domschke  et  al.  ( 1 992). 

Search  Intensification  and  Search  Diversification 

A  general  idea  for  reducing  the  computational  effort  in  a  tabu  search  algorithm  is  that  of 
search  intensification  using  a  so-called  short  term  memory.  Its  basic  idea  is  to  observe  the 
attributes  of  all  performed  moves  and  to  eliminate  those  from  further  consideration  that 
have  not  been  part  of  any  solution  generated  during  a  given  number  of  iterations.  This 
results  in  a  concentration  of  the  search  where  the  number  of  neighbourhood  solutions  in 
each  iteration,  and  consequently  the  computational  effort,  decreases.  Obviously  the  cost 
of  this  reduction  can  be  a  loss  of  accuracy. 

Correspondingly,  a  search  diversification  may  be  defined  as  a  long  term  memory  to 
penalize  often  selected  assignments.  Then  the  neighbourhood  search  can  be  led  into  not 
yet  explored  regions  where  the  tabu  list  operation  is  restarted  (resulting  in  an  increased 
computation  time).  An  appealing  opportunity  for  search  diversification  is  created  by  the 
idea  of  REM  and  REM2  resulting  in  REMl  for  l  >  2  and  integer.  If  at  any  tracing 
step  the  attributes  that  have  to  be  reversed  to  turn  the  current  solution  back  into  an 
already  explored  one  equal  exactly  l  moves  then  it  is  possible  to  set  these  moves  tabu 
for  the  next  iteration.  Note  that  for  the  case  of  multi-attribute  moves,  due  to  various 
combinations  of  attributes  to  moves,  even  more  than  l  moves  may  be  set  tabu  in  order  to 
avoid  different  paths  through  the  search  space  leading  to  the  same  solution.  Accordingly, 
search  diversification  is  obvious. 
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3  Parallel  Machine  Models 

Over  the  years  a  great  variety  of  architectures  have  been  proposed  for  parallel  computing 
The  most  widely  known  classification  of  parallel  machine  models  (although  somehow 
limited)  is  given  by  Flynn  ( 196C).  lie  distinguishes  four  general  classes  based  on  the  idea 
of  whether  single  or  multiple  instruction  streams  are  executed  on  either  one  or  multiple 
data  set  streams: 

•  SISD  (Single  Instruction ,  Single  Data)  including  the  classical  sequential  computers 

•  SIMD  (Single  Instruction,  Multiple  Data)  including  vector  computers  and  array 
processors 

•  MISD  (Multiple  Instructions ,  Single  Data) 

•  MIMD  (Multiple  Instructions,  Multiple  Data)  with  the  processors  performing  each 
successive  set  of  instructions  either  simultaneously  (synchronous)  or  independently 
(asynchronous) 

The  above  classification  of  parallel  machine  models  may  lead  to  different  classes  of 
parallel  algorithms.  Vectorized  algorithms  operate  uniformly  on  vectors  of  data  sets 
(SIMD).  Systolic  ones  operate  rhythmically  on  streams  of  data  sets  (SIMD  and  synchro¬ 
nous  MIMD).  Parallel  processing  algorithms  operate  on  a  set  of  synchronously  commu¬ 
nicating  parallel  processors  (synchronous  MIMD).  Correspondingly,  asynchronous  com¬ 
munication  leads  to  distributed  processing  algorithms  (asynchronous  MIMD  and  neural 
networks). 

In  addition  to  architectural  aspects  communication  networks  are  used  to  classify  par¬ 
allel  machine  models.  For  instance,  it  makes  a  difference  whether  processors  have  si¬ 
multaneous  access  to  a  shared  memory,  allowing  communication  between  two  arbitrary 
processors  in  constant  time,  or  whether  they  communicate  through  a  fixed  interconnection 
network.  Less  formally,  in  certain  models  it  is  assumed  that  there  is  a  master  processor 
controlling  the  communication  of  the  network,  with  the  remaining  processors  of  the  net¬ 
work  called  slaves.  For  a  comprehensive  survey  on  parallel  machines  and  algorithms  see 
e.g.  Akl  (1989)  and  Van  Leeuwen  (1990). 

The  quality  of  parallel  algorithms  may  be  judged  by  a  number  of  quantities,  the  most 
important  one  being  the  speedup,  which  is  the  running  time  of  the  best  sequential  imple¬ 
mentation  of  the  algorithm  divided  by  the  running  time  of  the  parallel  implementation 
executed  on  a  number  of  p  processors.  Similarly,  given  a  prespecified  time  limit  (cf.  foot¬ 
note  1)  a  scaleup  may  be  defined  as  Hie  ratio  of  the  average  problem  sizes  solvable  with 
a  parallel  implementation  to  a  sequential  implementation  of  the  algorithm.  With  heuri¬ 
stics,  the  solution  quality  attainable  may  also  be  measured.  The  processor  utilization  or 
efficiency  is  the  speedup  divided  by  p.  The  best  one  can  achieve  is  a  speedup  of  p  and  an 
efficiency  equal  to  one. 


4  Parallel  Tabu  Search  Algorithms 

Due  to  the  success  and  the  underlying  simplicity  of  the  main  idea  of  tabu  search,  recently 
some  implementations  on  parallel  computers  have  come  up  tailored  to  specific  problems. 
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Surprisingly,  to  tiie  best  of  our  knowledge,  they  are  solely  devoted  to  problems  using  ihe 
notion  of  paired-attribute  moves:  the  travelling  salesman  problem,  the  job  shop  problem, 
and  the  quadratic  assignment  problem  (compare  Section  5). 

In  a  first  step  we  shall  describe  a  classification  of  different  types  of  parallelism  that 
is  applicable  to  most  iterative  search  techniques  (cf.  VoB  (1992)).  Its  basis  is  the  idea  of 
having  different  starting  solutions  or  candidate  solutions  (so-called  balls,  motivated  by  the 
idea  of  mountains  like  solution  space  where  a  ball  is  rolling  to  find  a  stable  low  altitude 
state)  as  well  as  a  number  ot  different  strategies,  e.g.  based  on  various  possibilities  of  the 
parameter  setting  or  on  the  tabu  list  management. 

•  SBSS  (Single  Ball,  Single  Strategy ) 

The  algorithm  starts  from  exactly  one  given  feasible  solution  and  performs  its  moves 
following  exactly  one  strategy. 

•  SBMS  (Single  Ball.  Multiple  Strategies) 

I  lie  algorithm  starts  from  exactly  one  given  feasible  solution  by  the  use  of  different 
strategics  where  cadi  strategy  is  performed  on  a  dilfereni  processor. 

•  MBSS  (Multiple  Balls,  Single  Strategy) 

The  algorithm  starts  from  different  initial  feasible  solutions,  each  on  a  dilferenL  pro¬ 
cessor.  The  same  type  of  instruction,  i.e.  strategy,  is  performed  on  each  processor. 

•  MBMS  (Multiple  Balls.  Multiple  Strategies) 

Tlic  algorithm  starts  from  different  initial  feasible  solutions  performing  diderent 
strategies. 

In  wiial  follows  we  discuss  the  above  ideas  in  more  detail  with  special  emphasis  on 
further  principles  of  parallelism  within  specific  strategies.  For  ease  of  description  we 
assume  the  notion  of  parallel  or  distributed  processing  algorithms. 

SBSS 

The  single  ball,  single  strategy  idea  is  the  simplest  version,  and  obviously  corresponds  to 
the  idea  of  classical  sequential  compulations  (cf.  the  SISD-model).  This,  however,  does 
not  restrict  the  possibility  of  parallelization. 

Starling  from  an  initial  feasible  solution,  the  best  move  which  is  not  tabu  must  be 
performed.  The  search  for  this  move  may  he  done  in  parallel  by  decomposing  the  set  of 
admissible  moves  into  a  number  of  subsets.  F.g.  in  a  master-slave  architecture  each  (slave) 
processor  may  evaluate  the  best  move  in  a  specific  subset.  The  best  move  of  each  subset 
is  communicated  to  the  master  who  picks  the  overall  best  as  the  transformed  solution  and 
also  performs  the  tabu  list  management. 

To  restrict  the  amount  of  communication  necessary  for  synchronizing  the  data  each 
slave  could  determine  the  lies!,  possible  move  in  its  subset  without  observing  any  tabu 
list,  while  the  tabu  list  in  the  same  time  is  updated  by  the  master.  Then  the  master  picks 
among  ail  answers  the  best  which  is  not  tabu.  If  no  such  move  exists,  a  second  trial  must 
be  made  while  each  processor  has  to  receive  and  to  observe  the  tabu  list.  Otherwise  the 
next  iteration  is  to  be  performed. 
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Additional  ideas  may  be  developed  with  respect  to  the  specific  strategies,  in  TNM. 
the  tabu  list  management  may  be  done  by  eacli  processor  itself  by  simpiy  providing  the 
most  recent  move  (whoso  complement  will  be  in  the  list).  In  CSM,  the  master  builds  *,h ■« 
cancellation  sequences  and  partitions  them  to  Lite  slaves,  i.c..  every  slave  has  to  evaluate 
a  certain  number  of  sequences.  In  subsequent  iterations,  the  attributes  ol  the  curro.i-. 
moves  are  communicated.  Whenever  a  cancellation  sequence  is  reduced  to  I  it  wii  be 
re-communicated  to  the  master. 

SBMS 

In  SBMS  each  processor  executes  a  process  which  is  one  of  the  above  tabu  search  strategies 
with  different  tabu  conditions  and  parameters,  iikee.g.  REMt  for  various  i.  For  TNM  this 
can  be  different  (eventually  randomly  modified)  tabu  list  lengths;  for  CSM.  different  tabu 
durations  may  be  considered.  The  (slave)  processors  are  halted  after  a  prespecified  ’ime 
and  the  results  are  compared  and  the  best  one  is  calculated.  A  restart  is  possible  with  the 
best  or  a  good  seed  solution.  Each  strategy  may  take  a  different  path  through  tiie  search 
space  because  of  different  tabu  list  management  or  parameter  setting.  A  restart  may  be 
performed  either  with  empty  running  and  tabu  lists  or  with  a  previously  encountered  list. 

MBSS 

The  multiple  balls  approaches  start  from  at  most  ;>  (the  number  of  processors  available; 
different  initial  feasible  solutions,  whose  calculation  can  vary.  They  may  he  determined 
either  randomly  or  by  applying  different  heuristics  to  the  same  problem.  This  may  aivo 
incorporate  ideas  involving  different  diversification  and  intensification  strategies  as  des¬ 
cribed  above.  A  third  possibility  assumes  one  given  feasible  solution  and  starts  wilii  r. 
suitable  subset  of  its  transformed  (neighbourhood)  solutions.  (Especially  with  REM2  it 
may  be  assured  that  even  in  future  iterations  there  is  no  overlap  with  the  initial  feasible 
solutions  of  the  other  processors.)  The  single  strategy  approach  assumes  the  application 
of  exactly  one  tabu  search  algorithm  with  the  same  parameter  setting  for  all  processors. 

As  with  SBMS,  the  processes  may  be  halted  after  a  specific  time  period  to  coordinate 
their  results  and  possibly  to  initiate  a  restart  with  new  (hopefully)  improved  solutions 
If  the  processes  are  performed  synchronously,  then  the  stopping  may  be  initiated  after 
having  generated,  say,  rn  successive  moves.  On  synchronous  M1MD  machines  the  iatter 
approach  may  be  especially  relevant.  Note  that  the  above-mentioned  possibility  of  pa¬ 
rallelization  within  SBSS  is  related  to  a  method  with  m  =  1  where  the  best  transition  is 
evaluated.2  With  respect  to  MBSS,  this  modifies  to  the  evaluation  of  the  p  best  moves 
usable  for  a  restart.  For  m  >  2  this  approach  may  he  used  as  a  look  ahead  tnelhod. 

MBMS 

The  multiple  balls,  multiple  strategics  approach  subsumes  all  previous  classes,  allowing 
search  within  the  solution  space  from  different  starting  points  with  different  methods  or 
parameter  settings. 

JThia  gives  reference  to  incorporate  different  candidate  list  strategies.  (Note  the  correspondance  to 
ideas  of  beam  search,  cf.  Glover  (1990).) 
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6  Examples 

mi  the  sequel  we  sketch  some  of  1  !.e  ideas  given  in  the  previous  sections  with  respect  to  weii 
unown  combinatorial  oplum/a' ion  problems.  As  mentioned  above,  we  only  found  some 
work  op.  problems  with  the  idea  of  paired-attribute  moves  to  perform  the  neighbourhood 
sear -n  7/e  start  with  rcspeo1  N-  oinarv  integer  programming,  exploiting  single-attribute 
moves. 

Consider  the  SUSS  concept.  \iso  consider  n  decision  variables  in  a  binary  problem 
with  no  (implicit  or  explicit )  restriction  on  the  number  of  variables  set  to  either  1  or  0.  We 
may  define  simple  ADD-  <>r  DKOP-moves  by  complementing  the  corresponding  entries 
of  the  binary  variables  x,.  Assume  the  existence  of  n  -I-  2  processors  with  n  +  2  being 
the  master  processor.  The  tabu  list  management  is  performed  by  processor  n  +  1.  In 
any  iteration  of  the  search,  each  of  the  synchronously  controlled  processors  i  €  { 1, . . . ,  uj 
receives  the  information  whose  variables'  entry  has  been  chosen  to  be  exchanged  as  the 
most  recent  move.  This  move  is  performed  together  with  the  reversion  of  x,.  This  usually 
can  be  done  quite  efficiently  by  reconstructing  the  previous  solution  stored  at  i  with  at 
most  one  assignment  complemented.  Then  i  offers  its  objective  function  value  to  the 
Miaster  who  re-calis  all  results  of  processors  referring  to  non-tabu  moves  (evaluated  by 
processor  u-f  !).  Obviously  liiis  approacu  may  be  generalized  m  various  ways  to  the  more 
general  classes  described  above. 

This  concept  may  lie  applied,  eg.,  to  the  warehouse  location  problem  (WLP),  to 
Steiner's  problem  in  graphs  (Si5),  and  to  the  mullicorislraiut  zero-one  knapsack  problem 
(MCKP).  Eg-,  for  WLP  this  naghbomiiood  search  means  a  reallocation  of  costumers, 
i.e.,  opening  a  new  location  :  results  in  re  allocating  all  costumers  for  which  r  is  closer 
than  the  depot  currently  used.  Correspondingly,  closing  a  location  i  forces  all  costumers 
receiving  service  from  i  to  its  second  nearest,  location. 

An  even  more  challenging  reoptimization  problem  arises  within  SP.  There,  an  itera¬ 
tion  of  trie  neighbourhood  search  may  consist  of  changing  a  node-oriented  binary  variable 
and  calculating  a  minimum  spanning  tree  (MS  P)  on  the  sol  of  all  nodes  with  entry  1  of 
the  corresponding  variables.  The  question  is,  whether  reoptimization  may  be  carried  out 
either  by  solving  the  modified  problem  anew  or  by  starting  from  a  previous  optimal  solu¬ 
tion  found  by  the  same  processor  (sec  Clover  ct  al.  (1992)  for  a  corresponding  sequential 
approach  with  respect  to  MS  I  ). 

If  the  number  or  weighted  number  of  variables  with  value  1  is  limited  (as  for  MCKP) 
or  fixed  (as  e.g.  in  the  p-median  problem)  then  the  same  approach  may  be  applied  with 
combined  ADD/DROP-  or  SWAP-moves  leading  to  paired-attribute  moves. 

Malek  et  al.  (1989)  follow  the  .SUMS  approach  to  solve  travelling  salesman  problems 
fTSP)  by  TNM  with  the  2-opl  exchange  as  moves.  The  tabu  attributes  follow  different 
strategies  in  that  they  are  restricted  either  to  one  or  to  the  two  cities  that  have  been 
swapped  or  to  the  cities  and  t  heir  respective  positions  in  tour.  In  addition  different  tabu 
parameters  were  used  on  different  processors.  For  another  parallel  tabu  search  algorithm 
for  the  TSP  see  Fiechler  ( i990). 

The  quadratic  assignment  problem  (QAP)  is  treated  by  Cliakrapani  and  Skorin-Kapov 
(1991,  1992)  by  the  use  of  SUSS  and  TNM  with  search  intensification  and  search  diversi¬ 
fication  performed  sequentially  while  evaluating  the  moves  in  parallel.  The  set  of  moves 
is  partitioned  into  disjoint  subsets,  each  one  on  a  different  processor  as  described  above. 
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The  neighbourhood  search  is  performed  by  pairwise  interchanges  such  that  fcr  t.’fu2  , 
processors  available  all  moves  can  be  evaluated  in  constant  time,  achieving  a  speedun  of 
0{nJ j  logn).  Battiti  and  Tecchioiii  (1992)  use  TNM  together  with  a  basiling  function 
and  compare  their  algorithm  also  with  a  parallel  genetic  algorithm.  Another  parallel  al¬ 
gorithm  lor  QAP  based  on  TNM  (with  randomly  varying  ll_size;  has  been  prer-eme-i  i>\ 
Taillard  (1991).  It  is  an  SBSS  approach,  too.  The  same  idea  has  also  been  applied  to  the 
job  shop  as  well  as  to  the  flow  shop  problem  (see  Taillard  (1989,  1990)).  The  latter  r 
fact,  also  describes  a  single-attribute  based  implementation  with  attributes  corresponding 
to  objective  function  values.  Chakrapani  and  Skorin-Kapov  (1992)  is  especially  rele\a:it 
since  its  implementation  is  based  on  a  connection ist  approach  related  to  a  Moil /.mam 
machine  (cf.  Aarts  and  Korst  (1989)). 

6  Conclusions 

We  have  summarized  some  ideas  for  developing  parallel  tabu  search  algorithms.  Motivated 
by  a  famous  classification  scheme  for  parallel  machine  models  we  proposed  a  cJassificatior 
scheme  for  parallel  tabu  search  algorithms.  While  research  in  this  field  is  still  in  its  infancy 
we  believe  that  reasonable  achievements  in  the  following  two  aspects  will  be  provided. 

•  Development  of  a  framework  for  a  general  parallel  tabu  search  algorithm  that  can 
be  applied  to  a  wide  range  of  combinatorial  optimization  problems. 

•  Empirical  results  for  parallel  tabu  search  algorithms  tailored  to  specific  problems. 

Some  results  known  from  the  literature  (cf.  Section  5)  support  this  feeling.  Despite  In- 
emphasis  on  parallel  tabu  search,  sequential  testing  is  still  far  from  complete.  In  addition, 
the  tabu  search  metastrategy  should  be  tested  on  different  classes  of  parallel  algorithms 
and  machine  models.  Especially  relevant  seems  to  be  a  comparison  of  algorithms  tai¬ 
lored  to  different  hardware  specifications  like  vector  computers  versus  synchronous  and 
asynchronous  M1MD  machines.  However,  one  should  take  into  account  identical  user 
specifications  with  respect  to  tabu  search  (e.g.  parameter  setting,  definition  of  the  neigh 
bourhood).  Note  that  our  classification  scheme  is  not  restriced  to  parallel  tabu  search, 
but  inay  be  applied  for  nearly  any  iterative  search  procedure,  such  as  simulated  annealing 
or  genetic  algorithms. 
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The  work  of  a  transport  company  (bus,  train,  etc.)  may  be 
represented  by  a  schedule  which  specifies  the  journeys  to  be 
undertaken.  Figure  1  is  a  graphical  representation  of  part  of 
such  a  schedule,  with  each  line  showing  the  times  that  a  service 
begins  and  ends,  and  each  '+'  showing  the  time  of  a  relief 
opportunity  at  which  the  driver  of  that  service  may  be  replaced 
by  another  driver.  An  indivisible  period  which  must  be  worked  by 
the  same  driver  (e.g.  between  two  consecutive  relief 
opportunities)  is  called  a  workpiece. 
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Each  driver's  working  day  consists  of  a  number  of  workpieces.  A 
complete  specification  of  a  drivers  working  day,  including 
sign-on,  sign-off  and  mealbreak  times,  is  called  a  duty.  Every 
transport  company  has  many  conditions  that  its  duties  must 
satisfy,  usually  called  the  "union  agreement".  This  agreement  may 
specify,  for  example,  the  maximum  length  of  a  working  day  and 
durations  of  mealbreaks.  There  is  usually  a  very  large  number  of 
different  duties  that  could  be  used  to  cover  a  schedule. 


607 


There  are  several  computer  systems  which  can  be  used  to  determine 
a  set  of  valid  drivers'  duties  to  cover  a  schedule  provided  by  a 
transport  company.  This  paper  will  consider  enhancements  that 
have  recently  been  devised  for  one  3uch  system  called  IMPACS 
(Integer  Mathematical  Programming  for  Automatic  Crew  Scheduling). 
This  system  was  developed  at  the  University  of  Leeds  by  Wren  & 
Smithll]  and  is  now  marketed  by  the  Hoskyns  Group.  IMPACS  has 
mainly  been  used  by  bus  companies  (throughout  the  world)  but  it 
has  also  been  used  by  train  and  tram  companies. 

At  the  heart  of  the  IMPACS  system  is  an  Integer  Programming  model 
which  has  two  pre-emptively  ordered  objectives:  to  minimise  the 
total  number  of  duties  used  to  cover  a  given  schedule  and  to 
minimise  a  cost  function  which  reflects  both  the  wage  cost  and 
undesirable  features  of  duties.  The  model’s  constraints  ensure 
that  all  workpieces  are  covered  at  least  once,  with  some 
specially  selected  workpieces  being  covered  exactly  once.  Also, 
each  duty  is  classified  according  to  its  type  (e.a.  early,  late, 
overtime)  and  side  constraints  can  be  added  which  limit  the 
number  of  duties  of  any  type  that  are  to  be  used. 

Thus,  the  model  is  of  the  mixed  set  covering/partitioning  type, 
possibly  with  the  addition  of  side  constraints.  Ongoing  research 
attempts  to  exploit  further  the  special  structure  of  the  IMPACS 
model  and  to  take  advantage  of  recent  developments  in 
mathematical  programming  algorithms. 
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The  IMPACS  model  has  previously  been  solved  using  the  following 
four -stage  process.  For  the  first  three  stages,  the  Linear 
Programming  relaxation  of  the  model  is  used. 

Stage  1  Minimise  the  total  number  of  duties  using  a  Primal 
Simplex  algorithm. 

Stage  2  Add  a  constraint  which  ensures  that  the  integral  number 
of  duties  does  not  increase  and  minimise  the  cost 
function  uring  a  Primal  Simplex  algorithm. 

Stage  3  If  the  total  number  of  duties  13  not  integral,  add  a 
suitable  constraint,  and  reoptimise  using  a  Dual 
Simplex  algorithm. 

Stage  4  Determine  an  integer  solution  using  Branch  and  Bound 
techniques  with  constraint  branching. 

Optimisation  within  the  IMPACS  system  is  based  on  Ryan's  ZIP 
package[2l.  The  performance  of  this  package  has  been  improved  by 
incorporating  Goldfarb  &  Reid's  Primal  Steepest  Edge  algonthm[33 
and  a  Dual  Steepest  Edge  algorithm  due  to  Forrest  &  Goldfarbl4) . 

This  paper  will  consider  a  new  strategy  for  solving  the  Linear 
Programming  relaxation.  Enhancements  to  stage  4  are  the  subject 
of  separate  work. 


Each  of  stages  1  and  2  of  the  previous  strategy  typically  involve 
a  large  number  of  Iterations,  resulting  in  the  time  to  solve  che 
Linear  Programming  relaxation  being  a  significant  proportion  of 
the  total  solution  time.  This  i3  due  to  the  objectives  for  stages 
1  and  2  being  different  and  the  high  degree  of  degeneracy 
inherent  in  the  model.  -Also,  the  constraint  that  is  added  at 
stage  2  is  fully  dense,  and  this  substantially  increases 
iteration  timings. 

These  difficulties  have  been  addressed  by: 

1.  Using  a  single  weighted  objective  function, 
and  2.  Solving  the  resulting  model  using  a  Dual  Steepest  Edge 
algorithm. 

The  weight  that  is  used  to  combine  the  two  objectives  is 
relatively  small,  and  is  determined  by  applying  an  algorithm  due 
to  Sherali[5]  to  the  IMPACS  model.  To  initiate  the  Dual  Simplex 
algorithm,  an  heuristic  has  been  developed  to  produce  initial 
basic  dual  feasible  solutions. 

The  paper  will  conclude  with  the  presentation  of  computational 
results  for  real  world  problems  with  numbers  of  constraints  in 
the  range  125  to  450  and  numbers  of  variables  in  the  range  4000 
to  11000.  The  results  suggest  that  the  new  strategy  significantly 
reduces  solution  timings. 


610 


references 


.  'Wren  A  S.  Smith  B  M  (1988)  Experiences  with  a  Crew  Scheduling 
Jvitem  cased  on  Set  Covering.  In  Computer-Aided  Transit 
heduling  (Eds.  Daduna  j  R  &  Wren  A),  pp.  160-174, 
pringer-Verlag,  Berlin. 

i.  lyan  D  M  (1980)  ZIP  -  A  Zero-One  Integer  Programming  Package 
for  Scheduling.  Report  CSS  85.  AERE.  Harwell,  Oxfordshire.  United 
Kingdom. 


j.  Goldfarb  D  &  Reid  J  K  >1977)  A  Practicable  Steepest  Edge 
Simplex  Algorithm.  Mathematical  Programming  4.  pp.  26-29. 

Forrest  J  J  &  Goldfarb  D  (1991)  Steepest  Edge  Simplex 
Algorithms  for  Linear  Programming.  IBM  Research  Report,  T  J 
Watson  Research  Center.  Yorktown  Heights,  New  York  10598,  USA. 

.i.  Sherali  H  D  (1982)  Equivalent  Weights  for  Lexicographic 
Multi-Objective  Programs:  Characterizations  and  Computations. 
EJOR  18.  pp.  57-61. 


Extended  Abstract 


Applied  Mathematical  Programming  and  Modelling  Symposium 
Budapest,  Hungary,  January  1993 
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This  talk  will  begin  by  discussing  duality  in  Mathematics  in  a  widei 
context  e.g.  in  the  areas  of  Set  Theory  and  Logic,  Projective  Geometry  anc 
Convex  Polytopes.  Some  of  the  mathematical  properties  which  are  normally 
expected  of  a  dual  will  be  listed  e.g.  Reflexivity  and  Symmetry.  Linear 
Programming  (LP)  and  Congruence  duality  will  then  be  examined  for  both  its 
mathematical  properties  and  computational  and  economic  uses  e.g.  Provide? 
Optimality,  Sensitivity  Analysis  and  Pricing  Imputation. 

A  number  of  possible  Integer  Programming  (IP)  duals  will  be 
mentioned  e.g.  the  Gomory-Baumol  dual,  Lagrangean  dual  and  Surrogate 
dual.  They  all  lack  some  of  the  above  properties  and  in  particular  do  not 
provide  a  guaranteed  proof  of  optimality. 

It  will  be  suggested  that  the  most  satisfactory  dual  arises  from 
examining  the  Value  Functions  and  Consistency  Testers  of  IPs.  For  Pure 
IPs  (PIPs)  these  take  the  form  of  Gomory  Functions.  Gomory  functions  are 
built  up  by  the  repeated  applications  of  the  operations  of 

(i)  Non-negative  linear  contributions. 

(ii)  Integer  round-up. 

(iii)  Taking  Maxima. 
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These  can  be  expressed  in  the  form 

Max(Cj,  C2,  Cn)  (1) 

where  the  C;  are  Chvdtal- Functions  which  are  built  up  from  operations  (i) 
and  (ii). 

By  comparison  t!»e  Value  function  of  the  Consistency  Tester  of  the 
corresponding  LP  relaxation  will  involve  functions  of  the  form 

Max(C,,  C2.  ...,  Cr)  (2) 

where  r  <  n  and  C,  is  obtained  from  C-  by  dropping  the  operation  (ii). 

The  Cj  will  therefore  be  non-negative  linear  conbinations  of  the  right- 
hand-side  coefficients,  arising  from  the  dual  vertices  of  the  LP. 

It  will  be  shown  that  those  Cj  which  correspond  to  a  Cj  in  (2)  can  be 
obtained  by  finding  the  Value  function  of  PIPs  over  cones.  This  may  be  done 
by  obtaining  the  Hermite  Normal  Form  of  the  corresponding  basis  matrix  for 
the  LP  relaxation.  The  resulting  doubly  recursive  function  of  the  right-hand- 
side  coefficients  gives  the  Value  function  (and  Consistency  tester).  It  is 
suggested  that  the  depth  of  this  recursion  is  a  measure  of  complexity.  The 
problem  of  extending  this  method  to  give  the  Value  function  and  Consistency 
tester  for  a  general  PIP  will  be  considered. 

It  will  be  shown  that  the  Value  function  for  a  Mixed  IP  (MIP)  is  not 
generally  a  Gomory  function  although  the  Consistency  tester  is.  By 
incorporating  this  objective  as  a  constraint  and  finding  the  consistency  tester 
of  this  system  it  is  then  possible  to  characterise  the  Value  function  of  the 
MIP. 

The  Value  function  for  certain  MIP  applications  has  considerable  economic 
importance  since  it  shows  how  indivisible  resources  should  be  "priced".  This 
aspect  will  be  considered  in  relation  to  the  Fixed  Charge  Problem  and  the 
Power  Systems  Loading  Problem. 


1  General  Problem  Description 


Analysts  frequently  face  the  following  problem:  given  a  multivariate  (possi¬ 
bly  correlated)  population,  how  does  one  determine  a  good  estimate  of  the 
probability  function  (or  some  number  of  its  moments)  for  a  complicated  func¬ 
tion  of  the  population's  variables?  The  primary  problem  to  consider  then  i* 
what  is  the  most  efficient  way  to  sample  from  the  input  population,  espe¬ 
cially  when  sampling  is  extremely  expensive  and  must  therefore  be  limited 
to  a  predetermined  (small)  sample  size.  The  desire  is  to  generate  a  sampling 
plan  which  will  be  representative  of  the  population,  and  produce  estimates 
of  moments  which  have  desirable  statistical  properties.  However,  since  the 
larger  the  sample,  the  larger  the  cost,  there  is  a  trade-off  between  generating 
the  best  estimates  and  reducing  the  amount  of  sampling.  In  order  to  obtain 
better  estimates  from  sampling,  analysts  may  determine  them  by  using  data 
collected  from  a  stratified  sampling  of  the  population. 

A  special  form  of  stratified  sampling  is  latin  hypercube  sampling. 
In  this  stratification,  the  cumulative  distribution  function  for  each  of  the 
n  population  variables  is  divided  into  m  blocks.  The  intersection  of  these 
blocks  makes  up  a  hypercube  having  mn  cells.  If  all  mn  cells  were  sampled, 
the  sampling  approach  would  be  a  “full  factorial  design”.  Since  sampling 
is  assumed  to  be  expensive,  LHS  limits  the  sampling  to  only  m  of  the  mn 
possible  cells.  Thus,  a  LHS  plan  is  not  a  hypercube,  but  is  equivalent  to  a 
m  x  n  matrix  such  that  each  of  the  m  rows  defines  one  sampling  cell  of  a  mn 
hypercube. 

The  ith  row  of  a  LHS  sampling  plan  makes  up  what  will  be  referred 
to  as  “run  i” .  Defining  this  grouping  as  a  run  is  motivated  by  the  fact 
that  typical  applications  of  LHS  involve  computer-based  models  where  the 
number  of  runs,  m,  is  predetermined.  To  ensure  that  a  plan  offers  a  cross 
section  of  the  sampling  space,  an  additional  feature  of  LHS  is  that  each  block 
of  each  variable  must  be  picked  once.  Thus,  each  column  of  a  LHS  plan  is  a 
permutation  of  the  numbers  1  to  m. 


Obtaining  Minimum-Correlation  Latin  Hypercube 
Sampling  Plans  Using  Discrete  Optimization  Techniques 

Dr.  Leslie- Ann  Yarrow 
Department  of  Mathematics  and  Statistics 
Brunei  University 

Uxbridge,  Middlesex  UBIO  3PH  England 

Combinatorial  optimization,  by  its  broad  nature,  has  been  used 
to  model  and  solve  a  variety  of  problems  including  those  arising  in  decision, 
engineering,  and  physical  sciences.  The  focus  of  this  work  is  to  consider 
the  solution  of  a  sampling  design  problem  using  combinatorial  optimization. 
The  particular  design  problem  of  interest  here  is  minimum-correlation  latin 
hypercube  sampling  (hereafter  referred  to  as  MCLHS).  The  central  point  of 
this  research  is  the  development  of  combinatorial  optimization  procedures 
which  provide  MCLHS  plans.  This  is  an  entirely  new  approach  for  finding 
MCLHS  plans. 

We  introduce  integer  programming  (IP)  formulations  of  this  problem 
and  develop  a  procedure  for  determining  minimum-correlation  sampling  de¬ 
signs.  We  provide  the  obvious  IP  formulation  of  the  MCLHS  problem  which 
results  in  a  problem  having  an  exponential  number  of  variables  and  a  large 
(polynomial)  number  of  constraints.  We  then  transform  the  problem  into 
a  sequence  of  assignment  problems  with  side  knapsack  equations,  having  a 
polynomial  number  of  variables.  This  decomposition  was  found  by  exploit¬ 
ing  the  special  structure  of  the  problem  and  finding  tight  objective  function 
lower  bounds.  We  note  that  even  after  the  decomposition,  the  problem  still 
belongs  to  the  NP-hard  class.  Although  the  decomposition  and  subsequent 
development  of  solution  procedures  for  the  smaller  problems  are  discussed 
within  the  context  of  the  sampling  design  problem,  the  approach  may  be 
applicable  to  various  permutation-related  IP  problems  such  as  the  general 
quadratic  assignment  problem,  assignment  problems  with  side  constraints, 
and  the  asymmetric  travelling  salesman  problem  variation  where  the  objec¬ 
tive  is  to  find  a  tour  which  meets  a  specific  cost  value.  Thus,  while  the 
research  presented  here  focuses  on  solution  approaches  for  the  MCLHS  prob¬ 
lem,  the  general  theory  and  findings  might  well  prove  useful  for  the  solution 
of  other  problems  known  to  be  NP-complete. 

We  begin  with  a  description  of  the  general  LHS  and  MCLHS  problems, 
followed  by  integer  programming  formulations  and  a  discussion  of  optimiza¬ 
tion  procedures  developed. 
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To  describe  the  standard  approach  to  LHS,  we  begin  by  writing  the 
vector  of  variables  as  (A\,  X2, ...,  Xn)  and  assume  for  the  time  being  that  the 
variables  are  mutually  independent.  The  range  of  each  Xt  is  divided  into  m 
(=  number  of  runs)  ascending  intervals  of  equal  probability  and  a  random 
value  is  drawn  on  each  interval  foe  each  variable.  Next,  we  generate  the  order 
in  which  the  m  values  of  each  variable  are  to  be  used  in  each  run  by  creating 
a  sequence  of  n  random  permutations  of  the  integers  1  to  m.  Finally,  we 
form  the  required  vector  for  the  ith  run  by  taking  the  r'*  number  from  each 
of  the  n  random  permutations. 

Latin  hypercube  sampling  plans  generated  by  the  standard  approach 
are  restricted  only  in  the  sense  that  for  each  variable,  a  vaiue  must  be  picked 
once  and  only  once  from  each  of  its  m  intervals.  A  point  we  have  not  yet 
considered  is  the  impact  that  correlations  between  the  columns  of  a  LHS  sam 
pling  plan  may  have  on  the  generated  estimates.  For  ease  of  explanation,  we 
will  continue  the  assumption  that  the  population  variables  are  mutually  in¬ 
dependent,  although  similar  results  are  obtained  for  any  given  population 
covariance  matrix.  For  the  rt  variables,  although  their  sampling  plan  permu¬ 
tations  are  determined  independently,  a  standard  LHS  plan  will,  in  general, 
have  some  level  of  correlation  between  the  pairs  of  permutations.  Thus,  the 
sampling  plans  will  not,  in  general,  parallel  the  correlations  of  the  true  joint 
distributions.  If  LHS  sampling  is  done  without  concern  for  the  correlation 
pattern  (or  lack  thereof),  the  estimators  cannot  be  guaranteed  to  be  unbiased 
or  even  consistent. 

The  desire  then  is  to  design  LHS  plans  which  incorporate  the  vari¬ 
ables’  true  pairwise  correlations.  For  two  variables,  A,  and  X3,  with  distri¬ 
bution  functions  having  strictly  positive  standard  deviations,  <7,  and  oy,  the 
correlation  coefficient  between  the  variables  is  defined  as 

cov(ATt,  A'j) 

Pij  ~  Wj 

where  cov(A",,  X})  denotes  the  covariance  between  variables  X,  and  Xr 


To  approximate  the  pairwise  correlation  coefficients  p,y  we  will  con¬ 
sider  the  correlation  coefficients  between  the  pairs  of  LHS  plan  permutations 
associated  with  variables  Xi  and  Xy  (The  two  forms  of  correlations  are  equal 
when  X{  and  X:  are  both  uniformly  distributed.)  For  permutations  of  the 
integers  from  1  through  rrc ,  it  can  be  shown  that  the  correlation  coefficient 
of  the  indices  of  any  pair  of  permutations  is 

f_,_  pi 

m(m2  —  1)  ’ 

where  Dv  is  the  difference  between  the  vih  integer  elements  in  the  vectors. 
This  is  known  as  the  Spearman  rank  correlation  coefficient  and  can  take  on 
values  in  the  interval  (—  l,  Ij.  The  expected  value  of  the  rank  correlation 
coefficient  is  0,  and  its  variance  is  I/(m  —  1).  Throughout  the  remainder 
of  this  paper,  we  denote  the  rank  correlation  estimate  between  the  column 
permutations  of  variables  X ,  and  X}  by  rtJ. 

For  illustration,  suppose  we  want  to  run  a  model  with  three  mutually 
independent  uniformly  distributed  variables  (for  simplicity,  x,  y,  and  x), 
each  to  be  represented  by  values  chosen  from  their  respective  sample  spaces. 
Assuming  further  that  we  are  allowed  only  six  runs,  consider  the  LHS  plan 
given  below: 

Table  1:  Latin  Hypercube  Sampling  Example 
Model  Run  Variable  Values 


X\ 

y  1 

*5 

x2 

ye 

^3 

X3 

ys 

24 

*4 

3/3 

2l 

*5 

3/2 

22 

*6 

3/4 

26 

The  rank  correlation  coefficients  for  this  example  are 

r12  =  0.00,  r23  =  0.00,  rl3  =  0.00, 

and  hence,  it  appropriately  models  the  mutual  independence  of  the  three 
variables.  If,  for  example,  the  variables  were  dependent  with  true  joint  dis¬ 
tributions  having  pairwise  rank  correlations  of  say,  r12  =  —.6,  r23  =  —.42, 
and  r13  =  .14,  then  this  particular  sampling  plan  would  not  suitably  parallel 
these  true  rank  correlations. 
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The  objective  of  the  restricted  LHS  problem  we  consider  is  the  selec¬ 
tion  of  column  permutations  which  attempt  to  meet  exactly  the  true  rank 
correlations  associated  with  the  variables.  In  this  way,  sampling  is  intended 
to  match  more  closely  the  true  marginal  distributions  of  the  input  vari¬ 
ables.  Specifically,  the  minimization  problem,  called  minimum-correlation 
LHS  (MCLHS),  provides  a  sampling  specification  minimizing  the  sum  of  the 
absolute  values  of  the  pairwise  differences  (rtJ  —  rtJ).  In  much  of  the  discus¬ 
sion  however  we  will  minimize  the  sum  of  the  absolute  values  of  the  pairwise 
rank  correlations  fy,.  This  models  the  situation  when  independence  of  the 
variables  is  likely  (i.e.,  rtJ  =  0). 


2  Integer  Programming  Models  for  MCLHS 


The  minimum-correlation  latin  hypercube  sampling  problem  described  can 
be  formulated  as  a  n-index  assignment  problem  with  side  knapsack  equation 
constraints  (APSEC).  To  begin,  define: 

1  if  iquj  . . .  vn  is  a  sampled  cell 

where  the  n-indices  on  the  x-variable, 

X 

t>i,  U2, . . . ,  un,  can  each  take  a  value  from  1  to  m 
,  0  otherwise 

and  also  df-,  d"  €  such  that: 

dfj  -  =  ( ftj  -  r,y) m(m2  -  l)/6  i  =  1 . . .  n,  j  >  i. 

Thus,  dfj  and  d  ij  are  the  positive  and  negative  magnitudes  of  the  devia¬ 
tions  from  the  true  rank  correlation  of  the  rank  correlation  between  column 
permuations  i  and  j. 

Equivalent  to  minimizing  the  sum  L«=i  !  ?ij  ~  rij  i>  is 

minimizing  the  objective  function 

f£  E  (<$+<$)}■ 

S=1  j>i  ' 


mm 


Although  the  formulations  described  below  are  applicable  to  cases 
with  nonzero  r,;,  for  ease  of  presentation,  we  will  assume  =  0.  It  can  be 
shown  that 

m(m2  -  l)f/6  =  m(m2  —  l)/6  —  D2 

will  be  integer- valued  for  all  pairs  of  permutations.  Thus,  we  can  now  define 
dfj,  d~  €  2j+  such  that: 

d+  —  d~  =  m(m2  -  1  )ry,/6  t  =]  ...  n,  j  >  i. 

In  order  that  the  IP  formulation  fully  encompasses  the  MCLHS,  it 
must  include  assignment  constraints  that  draw  a  one-to-one  correspondence 
between  the  positive- valued  xvi,..v#l  of  a  feasible  solution  and  n-tupies  of  col¬ 
umn  permutations.  Thus,  the  j:h  column  permutation  requires  the  m  assign¬ 
ment  constraints 

m  m  m 

£  £  Z  ...v«  =  1  VJ  =  \  ...m 

Vi=l  1^2  —  1  Vn  =  l 

N—  ^  ^ 

excluding 

Additional  constraints  are  needed  to  enforce  that 

m(m2  —  I )r«y /6  =  df}  —  d~  holds  for  ail  i  and  j,  i  <  j  <  n.  These  constraints 
are 

mm  m 

X!  S  Y,  (u<  “  ^)2^v,...Un  =  m(m2  —  1  )/6  Vt  <;  <  n 

Vj=l  W3  =  l  Vn  =  l 

In  addition  to  belonging  to  the  class  of  NP-complete  problems,  we  see 
that  this  formulation  requires  mn  x-variables  as  well  as  a  total  of  n(n  —  1) 
deviational  (<f+, d~)  variables.  There  are  nm  assignment  constraints  and 
constraints  to  ensure  that  m(m2  —  l)r,;/6  =  df}  —  d~.  Hence,  although 
this  formulation  is  the  most  straightforward,  we  will  present  other  APSEC 
formulations  which  have  more  reasonable  problem  size  growth. 
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To  develop  alternative  formulations,  we  use  the  objective  function 
lower  bound  of  (”J6/(m(m2  —  1))  when  m  =  2  +  4/  for  some  nonnegative 
integer  /,  and  zero  otherwise,  and  make  the  assumption  that  given  k  —  1 
column  permutations  with  minimum  52i=i  I  i»  it  is  possible  to  fix 
these  columns  and  find  an  optimal  kth  column  permutation.  Our  research 
and  empirical  results  have  shown  that  these  are  valid  assumptions. 

Suppose  we  have  a  solution  to  the  (k  —  l)-dimensional  problem,  and 
wish  to  use  this  solution  to  obtain  a  solution  to  the  fc-dimensional  problem. 
Let  (p*,p2, . . .  ,pk)  denote  the  corresponding  column  permutation  vectois, 
and  define 

1  if  the  ith  element  of  column  k 
Xij  =  is  assigned  value  j 
0  otherwise 

To  ensure  that  column  k  is  a  permutation  of  numbers  1  ...  m  ,  we  add  the 
assignment  constraints  : 


Y,X'J  =1  j  =  1 , . . . ,  m 

I 

YsX'J  ~  1  *  =*  m. 

i 


We  see  that  the  positive  elements  of  an  x-solution  define  a  kth  column.  We 
will  henceforth  interchangably  use  the  terms  “an  x-solution”  and  “the  ktfl 
column  defined  by  the  positive  elements  of  x”. 

There  are  (k  —  1)  additional  constraints  of  the  following  form: 


in  <a -o/6-es.i  17.1  *  =  i — 1 

where  p\  is  the  itk  entry  of  the  column  permutation  vector  pl.  With  these 
constraints,  we  implicitly  fix  the  (k  -  1)  previously  found  column  permuta¬ 
tions. 


The  formulation  defined  thus  far  with  objective  function 


rain  £  (4k  +  d.k), 

i=i 

is  a  general  formulation  for  finding  a  kth  column  permutation,  having  fixed 
the  (k  —  1)  column  permutations  that  minimize  £,><  I  ?ij  I-  Empirical 
evidence  strongly  supports  that  there  exists  a  kth  column  that  meets  the 
lower  bound  for  j  |,  i  =  1, . . . ,  k  —  1.  Hence,  there  will  exist  a  solution  to 
the  assignment  constraints  that  generates  a  kth  column  satisfying 

|  m(m2  -  l)r,fc/6  |  = 


j  mfm2  -  l)/6  -  Dl  | 


1  if  rn  =  2  +  4/,  /  €  Zf 
0  otherwise 


for  all  i  —  1, ...  k  —  1.  To  incorporate  this  into  the  formulation,  we  require 
that  d?k  and  djj.,  L  =  1, . . .  k  —  1  be  binary  variables.  For 
m  ^  6  +  4/,  l  €  2’i+,  any  solution  that  obtains  the  lower  bound  must  have 
d ^  +  d~k  =  0.  If  however,  m  =  2  +  4/,  /  £  Zf,  we  can  conclude  that 
+  djj.  =  1 ,  t  s=  1, . . .  k  —  1.  In  either  case,  the  problem  can  be  restated 
as  a  feasibility  problem  with  no  objective  function.  We  shall  refer  to  this 
feasibility  assignment  problem  with  side  equations  as  FASE. 

The  FASE  formulation  follows  the  conjecture  that  one  can  itera¬ 
tively  solve  fc-dimensional  problems  using  the  previously  determined  (k  —  1)- 
dimensional  solutions.  Thus,  rather  than  solving  one  large  APSEC  program 
with  mn  +  n(n  —  1)  variables  and  nm  +  (j)  constraints,  one  could  solve  a 
sequence  of  smaller  two-dimensional  FASE  problems  with  at  most  m2  -f  2k 
variables  and  2m  -f  k  constraints  (  2  <  k  <  n  ). 

In  the  presentation,  we  shall  discuss  heuristic  and  Lagrangean- based 
solution  procedures  developed  to  solve  the  MCLHS  problem  and  its  equiva¬ 
lent  formulation  FASE. 
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Abstract 

This  paper  describes  a  standard  for  the  use  of  GAMS  2.25  as  an 
object-oriented  modeling  language.  The  over-riding  benefit  of  using 
this  technique  is  the  ease  with  which  many  individuals  can  simulta¬ 
neously  develop  extraordinarily  complex  modeling  systems.  Lesser, 
but  still  important  benefits  include:  structured  user-interface  design, 
plug-in/plug-out  models,  isolating  portions  of  the  problem,  easy  main¬ 
tenance  and  updates,  and  model  re-use.  Simultaneous  model  devel¬ 
opment  stems  from  the  latter  benefits,  while  all  of  these  advantages 
derive  from  the  clear,  rigorous  organization  of  your  model  as  specified 
in  the  following  standard. 

We  present  the  concepts  of  encapsulation  (forming  objects)  and  hi¬ 
erarchical  modeling  in  the  context  of  mathematical  modeling.  Encap¬ 
sulation  is  a  well-known  programming  technique  that  is  newly  applied 
to  modeling,  and  our  version  of  hierarchical  modeling  differs  slightly 
from  past  notions.  Traditionally,  a  hierarchical  model  embodies  the 
concept  of  forming  larger  models  from  a  collection  of  sub-models.  The 
following  method  is  based  on  a  partition  of  the  relations  (equations)  of 
the  model,  where  the  elements  of  the  partition  are  partially  ordered. 


1  Overview 

Object-oriented  modeling  (00M)  is  a  method  of  modeling  that  closely  im¬ 
itates  object-oriented  programming  (OOP)  [?,?,?].  We  have  developed  a 
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standard  for  using  GAMS  2.25  [?]  as  an  00M  language.  The  difference  be¬ 
ing  that  the  00  models  are  much  more  structured  and  abstract.  This  makes 
them  more  user  friendly  because  their  use  is  well  defined  by  the  structure 
and  their  details  are  hidden  within.  00  Models  thus  appear  simpler  and 
more  uniform  to  the  user.  - 

Four  essential  properties  set  OOM  apart  from  standard  GAMS  2.25: 

Routines:  Structuring  the  assignment  statements  into  procedures  as  in  Pas¬ 
cal. 

Encapsulation:  Combining  data  and  variables  with  the  equations  and  as¬ 
signment  statements  that  manipulate  them  to  form  a  new  data  type — a 
model. 

Information  Inheritance:  Defining  a  model  that  uses  other  models  in  its 
formulation,  with  each  sub-model  inheriting  the  information  from  its 
ancestors.  The  use  of  models  within  models  defines  the  use  hierarchy 
which  forms  a  partial  ordering  of  all  used  models. 

Polymorphism:  Giving  a  model’s  routine  one  name  that  is  shared  by  all 
descendants  in  the  use  hierarchy,  with  each  descendant  implementing 
the  routine  in  a  way  appropriate  to  itself. 

Routines  are  implemented  using  the  SINCLUDE  statment.  Encapsulation, 
inheritance,  and  polymorphism  are  implemented  in  GAMS  2.25  through  self 
discipline.  The  following  is  a  detailed  discussion  of  the  principles  and  im¬ 
plementation  of  OOM  in  GAMS  2.25  through  self-discipline.  We  hope  that 
the  future  will  bring  the  language  extensions  need  for  a  proper  implementa¬ 
tion.  In  which  case,  the  standard  described  below  would  be  enforced  by  the 
compiler. 

There  are  now  a  variety  of  experimental  modeling  languages  offering 
object-oriented  features,  notably  ASCEND  [?]  and  MODEL. LA  (?].  We  of¬ 
fer  a  form  of  inheritance  that  differs  from  the  class  inheritance  of  standard 
OOP  and  OOM  languages.  This  is  an  extra  restriction  placed  on  our  models 
based  on  deferred  requirements,  and  the  use  of  models  within  other  models. 
Data  and  variables  are  legated  (passed  down)  to  the  descendants,  while  meth¬ 
ods  are  used  by  ancestors  to  ensure  that  deferred  information1  is  properly 
defined. 

1  Data  and  variables  that  have  been  declared  but  are  yet  undefined. 
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There  is  a  restricted  form  of  communication  control,  between  the  models  C 
the  use  hierarchy.  Essentially,  descendants  can  inspect  ancestor  infor.-  atio:; 
but  ancestors  can  o.u/  ask  that  certain  information  be  provided.  In  this  a  ay. 
siblings  communicate  through  the  parent,  and  its  deferred  information. 

We  further  expound  on  these  concepts  and  offer  a  full  accounting  of  the 
presentation.  First  we  introduce  a  model  and  how  it  is  encapsulated.  This 
leads  to  an  overview  of  traditional  hierarchical  modeling.  Then  we  exDiain 
how  OOM  fits  into  this  background.  The  final  section  gives  the  standard 
itself — how  to  implement  OOM  in  GAMS  2.25. 
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EXTENDED  ABSTRACT 


1.  Motivations  for  a  formal  theory. 

The  definition  of  a  specific  model  is  often  conceived  as  a  work  which  has  to  be  done 
from  scratch.  In  fact,  the  variety  of  the  variables  describing  the  modelled  reality  seems  to 
exclude  the  possibility  that  a  model  can  be  defined  assembling  pieces  of  correlated  sub¬ 
models.  i  o  define  models  from  scratch  greatly  decreases  the  productivity  of  the  work. 

It  seems  that  the  keyword  in  increasing  modelling  productivity  is  "reusability".  Models 
can  be  reused  and  integrated  so  to  produce  new  models.  Naturally  models  to  be  integrated 
have  to  be  expressed  using  a  common  base  and  the  result  has  to  lie  on  the  same 
framework.  In  this  paper  the  chosen  framework  is  the  Structured  Modeling,  as  formally 
defined  by  Geoffrion,  [3]. 

Here  we  define  three  integration  levels,  according  to  the  degree  of  influence  of  the 
operator  in  the  procedures  used  to  merge  the  models: 

Level  1  *  All  the  procedures  are  automated.  This  means  that  the  user  selects  the 
input  models  and  the  genera  to  be  integrated,  and  the  output  integrated 
model  is  automatically  produced. 

Level  2  -  The  user  selects  the  input  models  and  the  order  of  integration  among 
the  genera,  and  the  output  integrated  model  is  automatically  produced. 
Level  3  -  The  user  select  the  input  models,  the  genera  to  be  integrated  and 
formulate  the  steps  necessary  to  integrate.  The  output  integrated  model 
is  not  automatically  produced,  since  the  steps  can  vary  according  to  the 
situation. 

2.  Preliminary  results. 

In  the  rest  of  this  paper  we  assume  that  the  reader  is  familiar  with  the  formal  theory  of  the 
Structured  Modeling. 

Given  a  Structured  Model  Mi.  let  Gi  =  { gj,  j  =  1 . k>  be  the  set  of  all  the  genera: 

this  can  be  partitioned  into  three  disjoined  sets:  PC,  A  and  FT  such  that: 

PC  =  { gj  e  Gj:  gj  is  a  primitive  or  a  compound  entity  genus } 

A  =  { gj  g  Gj:  gj  is  an  attribute  genus  | 

FT  =  { gj  e  Gj:  gj  is  a  function  or  a  test  genus) . 

Lemma  1:  Any  genus  gj  e  PC ,  does  not  have  references  to  any  other  genera  gk  e  (A  v 
FT) 
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Proof:  Primitive  entity  elements,  by  definition,  have  no  calling  sequence,  therefore  they  do  not  have 
references  to  any  otherelemem:  compound  entity  elements,  by  definiuon.  are  construct  only  on  primitive 
entity  elements.  ■ 


Lemma  2:  Any  genus  gj  e  A,  has  only  references  to  another  genera  gg  e  PC,. 

Proof:  Attribute  elements,  by  definition,  characterize  only  primitive  and  compound  elements,  a 

Lemma  3:  Any  genus  g,  e  FT,  does  not  have  references  to  any  other  genera  gg  e  PC,. 

Proof :  Function  and  test  elements  call,  by  definition,  attribute.  function  and  test  elements;  therefore, 
they  cannot  call  primitive  and  compound  entity  elements* 

Definition  1:  Connected  Module,  Sub-Mode!. 

A  module  is  a  Connected  Module  if  its  genera  and  their  calling  sequences  define 
connected  graph.  A  Sub-model  is  a  connected  module  with  ai  leas:  one  primitive  emir, 
genus. 

Definition  2:  Behaviour  Equivalence  on  FT;  c.  FT*  . 

Two  structured  models  M i  and  Mi  are  Behaviour  Equivalent  on  FT /  £  FT;  anu 
FT2  d  FT2  if  the  following  two  conditions  hold : 

a )  The  set  A 1  of  the  attribute  genera  directly  or  indirectly  called  by  the  g,  e  FT  / ,  ana  me 
set  A  2  of  the  attribute  genera  directly  or  indirectly  called  by  the  gj  e  FT  >  have  the  same 
structure; 

b)  FT  1  and  FT 2  give  as  output  the  same  values. 

We  shortly  wme  “behaviour  equivalent  "  when  the  sub  set  FT,  coincides  with  FT,. 

Definition  3:  Normal  Model. 

A  model  is  called  normal  if  an  isomorphic  relation  exists  between  attribute  and 
compound  genera,  and  their  elements. 

The  graph  of  the  elements  of  a  normal  model  is  shown  in  figure  1;  dotted  rectangles 
identify  genera. 
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Proposition  1.  Given  a  Structured  Model  Mj,  it  is  always  possible  to  construct  a 
normal  model  NIM,- )  which  is  behaviour  equivalent  to  M, 

Proof:  Let  us  consider  a  generic  attribute  genus  gj  €  A,  c  Nlj.  It  is  always  possible  to  define  a  new 
compound  entity  genus,  ck  e  PCj,  with  the  same  calling  sequence  of  gj.  Lemmas  l  and  2  ensure  that 
genera  which  are  called  by  an  attribute  genus  can  be  called  by  a  compound  entity  genus  too.  An 
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isomocphic  relation  can  be  set  among  the  elements  of  gj  and  cy:  the  fust  element  of  gj  calls  the  first 
element  of  Cfc,  etc.  This  process  is  repeated  for  every  attribute  genus  of  Mi. 

If  we  indicate  with  N(M,)  the  modified  model,  the  set  B  =  |ck.  gj)  c  N(Mj)  «=s  gj  e  Mj  for  every  genus 
gk€  FT,c  N(Mj).  a 

Definition  4:  Index  Basis,  Index  Basis  Set. 

An  Index  Basis  of  a  normal  model  N(Mt)  is  a  couple  of  genera  Bj  =  {as,  c}),  where  at 
Ai  c  Mj  is  an  attribute  genus,  and  Cj  is  the  compound  entity  genus  called  by  ar  The 
genus  a.  is  called  value  component  of  Bj,  while  the  genus  Cj  is  called  index 

r. omponent .  The  set  BSj  =  (Bj,  j:l . k}  containing  all  the  index  basis  of  N(Mi)  is 

died  Index  3asis  Set. 

D^ftniiion  5:  Index  Function. 

\n  Index  Function  if  gj)  is  a  rule  which  associate  to  every  genus  gj  e  N(M,)  the 
cardinality  of  its  index  set. 

As  example,  given  a  genus  gj  indexed  by  j  x  *  x  / ,  its  index  function  i(gi)  returns  as  value  3. 

3.  Main  results. 

in  this  caragraph  we  try  to  give  an  example  for  each  level  of  integration  previously 
defined. 

3.1 .  l^evel  1  integration  example. 

To  show  the  first  level  of  integration  we  need  to  introduce  the  definition  of  a  function 
submodel. 

Definition  6:  Function  Sub-Model. 

SubM(f)  is  called  Function  Sub-model  if  the  following  properties  hold: 

a)  SubM(f)  is  a  normal  model. 

b)  SubM(f)  has  at  least  a  function  genus  fe  FT,  cM{  indexed  as  singleton. 

In  the  following  we  give  a  procedure,  which  transforms  a  Structured  Model  Mi  with  at 
least  one  function  genus  indexed  as  singleton  into  a  function  sub-model. 

The  following  procedure,  CREATE_FUNCTION_SUBMODEL.  needs  as  input  a 
model  Mi  and  a  singleton  genus  f  e  FTj  c  Mi,  and  produce  as  output  a  function 
sub-model.  The  proof  of  this  is  given  in  Proposition  2. 

orocedure  CREATE.FUNCTION_SUBMODEL  (input:  Mj.  f;  output:  SubMIfl); 

/*  Modify  Mj  into  a  function  sub-model  SubM(£)  */ 

begin 

/*  step  X.  -Normalize  the  model'  '/ 

NORMAL  (Mj  I  ; 

/'  atep  XX.  -Merge  functions'  '/ 

Create  a  LIST  of  calling  sequence  segments  Sj  of  f; 

repeat; 

Examine  the  segmenc  Sj  e  LIST; 
if  the  referred  genus  qy  e  FT, 

then 

/•  a  */  Substitute  Sj  with  the  calling  sequence  of  g^  ; 

/'  b  */  Substitute  the  value  field  of  qy  with  its  rule: 

/'  c  •/  Delete  g^; 

Delete  the  segment  sj; 
until  (end  of  LIST); 
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/*  »tap  III  'Delete  genera  ravin?  no  influence-  jr,  t'  *, 

Create  a  LIST  of  gj  €  M.  : 

repeat 

Examine  g ,  €  LIST: 

if  (g-i  €  FT j  end  ,1,  *  fi 

than  delete  gj; 

If  (gj  €  Aj  L  PC,  end  g.  is  not  called  directly  or  indirectly  by  (: 

than  delete  a-,; 

until  (there  are  nc  more  a,  e  FT,  c  M,_.  g,  »  ti  end  (there  are  nc  more  g.  t 
Aj  u  PC  I  C  M  j  not  called  directly  cr  indirectly  by  £: 

and 

Proposition  2:  Given  a  Structured  Model  M,  and  an  arbitrary  singleton  function  genus 
fe  FT j  c  Mj,  it  exists  a  transformation  T  such  that: 

T(M)  =  SubM(f ) 

and  SubMif)  and  Mi  are  behaviour  equivalent  onf. 

Proof:  By  applying  procedure  C REA TE  FUNCTION.  SUB  MODEL  which  defines  tnc  procedure  T  m 

Let  us  show  how  a  function  genus  f  can  be  reused  as  an  input  parameter  for  other 
models.  This  action  is  totally  automated,  here  is  an  example. 

Suppose  we  have  two  models  Mj  and  M2,  we  want  to  substitute  the  genus  gj  e  A|  c 
Mj  with  the  computed  value  given  by  the  genus  f  e  FTo  c  M2-  This  goal  is  achieved 
applying  the  following  procedure  (the  symbol  [Mi,  SubMo]  means  the  integrated  output 
model): 

procadura  REUSE  (input:  M; ,  M;.  g . .  1:  output:  [M; .  SubMo > ( ; 

/•  Integrate  Mi  and  M; .  f  is  substituted  to  g,  • 

basin 

/*  snap  I  'Changes  ir.  M* '  * 

CREATE. FUNCTION.SUBHODEL  (M: . f ;  SubMt ifll: 

Create  a  LIST  of  genera  g!  €  A;  C  SubMo ( f ! ; 

r apaac 

Add  the  calling  sequence's  segments  cf  g  ■,  e  A;  to  the  calling  sequence 
gt  €  A;  C  SubMa ( f : ; 
until  end  of  LIST. 

/*  Step  II  "Changes  in  M_"  * 

Create  a  LIST  of  genera  a,  e  FT  c  L (Mi: 
repeat 

Select  g.  from  LIST; 

If  gj  calls  g^  €  Aj 

than 

Substitute  gj  with  £  in  the  calling  sequence  of  gj; 

LIST  :  --  LIST  -  3,  ; 
until  and  of  LIST 

/*  Step  III  "Delete  attribute  genus"  */ 

Delete  g ■  €  A  j : 

and  . 


Proposition  3:  Given  two  Structured  Models  M /  and  Mi,  it  is  always  possible  >o 
substitute  an  attribute  genus  g,  e  Ay  c  M  /  with  a  singleton  function  genus  f  e  FT 2  c 
Mi  The  result  is  a  Structured  Model. 

Proof:  By  applying  (he  procedure  REUSE  we  obtain  as  result  the  model  {Mj.  SubM2).  Us  graph  of 
genera  must  be  (mile,  closed  and  acyclic. 

a)  Finiteness.  Step  III  guaranties  that  the  number  of  genera  of  (Mj,  SubM2)  is  equal  to  the  number  of 
genera  of  (Mj  \j  SubM2(f))  •  gj 
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b)  Closure.  By  steps  I  and  (1.  there  is  at  least  one  genus  of  Mj  calling  a  genus  of  SubM2<f)  and  at  least 
cne  genus  of  SubM^vO  calling  a  genus  of  Mj.  From  closure  of  Mj  c  SubM2(D  it  follows  the  closure  of 
{Ml.  SubMn]. 

c)  Acyclicity.  Let  us  consider  an  arbitrary  sub-set  of  genera  Gj  c  [Mj,  SubM2l.  and  let  us  assume  that 
it  is  cyclic.  Therefore.  G,  contains  genera  belonging  to  both  models,  because  no  new  references  are  set  by 
the  procedure  among  genera  belonging  to  only  one  model.  By  construction,  the  sequence  must  be  of  the 
type: 

{  ....  a.  €  A2  c  SubM2(f).  f _  )■ 

The  genus  following  f  in  the  sequence  has  to  be  a  function  genus,  while  the  genus  preceding  a,  has  to  be 
a  compound  entity  genus.  By  Lemma  3  there  are  no  references  among  function  genera,  and  compound 
entity  genera.  Therefore.  G,  cannot  by  cyclic.  ■ 

Figure  2  shows  how  two  models  are  integrated. 


Proposition  4.  Given  two  normal  models  Mj  and  M2,  the  integrated  model  obtained 
substituting  an  input  parameter  g  \  e  A/  cM;,  with  an  output  parameter  gi  e  FT 2  c  M2 
is  a  Structured  Modei  ifi(gi)  =  i(g2)- 

Proof:  it  follows  the  same  line  of  proposition  3.  The  necessary  condition  given  by  ihe  equality  of  the 
index  functions  ensures  the  closure  and  acyclicity  of  the  graph  of  the  elements.* 

Given  the  result  of  proposition  4  the  following  procedure  can  be  constructed.  The 
input  parameters  are  the  two  normal  models,  an  index  basis  of  the  model  Mj  and  a 
function  genus  of  the  model  M2. 

procedure  USE  (Input:  N(M|t.  N<M2>.  BA,  f.  Output:  { N ( H i .  N(M2>I): 

begin 

Select  €  Bj ; 

Compute  i i (aj ) ; 

Compute  13 (f); 

if  i^(a[)  *  13(f)  then  exit; 

Create  a  LIST  of  genera  gx  €  FTj  C 

repeat 

Select  g\  from  LIST; 

If  g  ^  has  a  reference  to  aj  then 

Substitute  the  reference  to  aj  with  a  reference  to  f; 

LIST  :=  LIST  -  gx; 
until  end  of  LIST; 

Delete  Bj; 

end . 
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The  following  steps  create  an  integrated  model,  which  is  the  same  result  as  in 
Geoffrion.  The  final  graph  of  genera  obtained  applying  sequentially  step  0  -  step  IV  is 
shown  in  figure  4. 

Stop  o. 

NORMAL  (fin I 
NORMAL  (mktl 
NORMAL  (mar) 

NORMAL  (mfgi 

Stop  I. 

SUBSTITUTE  (mkt,  mar,  (P.Dll.  IP.D2M; 

USE  (mar,  mat ,  ( V. 03 [ .  VI; 

Stop  II. 

USE  (mfg,  mar.  (V.05),  V) ; 

USE  (mar,  mfq.  ;E,D4),  El; 

Stop  III. 

SUBSTITUTE  (fin,  mar.  [P.D61.  [P.D21I; 

USE  (fin,  mar.  (E.D81 ,  E) ; 

USE  (fin.  mar.  (V.09),  VI: 

stop  IV. 

MERGE  (mfq.  mar.  P.  Ul ; 


Figure  4 

33  Level  3  integration  example. 

At  this  level  of  integration  the  user  needs  to  define  the  steps  to  integrate  the  models,  and 
there  are  no  automated  procedure.  Let  us  present  another  example  extracted  from 
Geoffrion  {4|.  The  steps  are  informally  defined,  since  the  user  will  formalize  them. 

Stop  1 

Delete  DEM  and  T:0EM  genera  from  TRANS  1 
Delete  SUP  and  T:SUP  genera  from  TRANS2 

Stop  II 

Merge  genus  OUST  from  TRANS2  with  genus 
PLANT  from  TRANS2 ; 

Stop  III 

Create  new  genera  T:DC  and  define  its 
reference ; 

stop  iv  (Optional) 

Create  a  r.ew  genus  TOTS  being  the  sum  of 
the  TOTS  function  genera  of  the  two 
models ; 

Stop  V  (Optional) 

Rename  genera;  rigUte  J 

4.  Conclusions. 

The  first  remark  about  the  definition  of  a  formal  theory  to  models’  integration  is 
modularity.  This  can  be  easily  achieved  projecting  the  theory  of  the  Structured  Modeling 
into  the  same  space  of  the  Object  Orientation  Principles. 

The  second  remark  regards  the  construction  of  three  sub-sets  which  contain  the 
procedures  characterizing  the  formal  rules  of  the  three  integration  levels. 

Both  aspects  will  be  deeper  developed  in  the  future. 
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32.  Level  2  integration  example. 

In  this  case  the  role  of  the  user  is  relevant,  since  the  input  parameters  to  be  merged  are 
only  identified  by  him. 

Proposition  5.  Given  two  normal  models  N(Mj)  and  N(M2)  and  the  corresponding 
index  basis  set  BS/  and  BS2,  the  integrated  model  obtained  substituting  in  N(Mi),  Bje 
BSj  with  Bge  BS2  is  a  Structured  Model. 

Proof:  To  substitute  Bj  with  implies  that  every  genus  gj  a  FT)  has  to  replace  the  reference  in  its 
calling  sequence  to  aj  €  Bj  whit  a*  6  B^.  The  graph  of  genera  of  the  integrated  model  has  to  be:  (a) 
finite;  (b)  closed  and  (c)  acyclic,  (a),  (b)  hold  by  construction;  (c)  hok.  ;;.y  lemma  2m 

Proposition  6.  Given  two  normal  models  N(Mi)  and  N(M2),  and  the  corresponding 
index  basis  set  BSi  and  BS2,  the  integrated  model  obtained  substituting,  in  N(Mi),  an 
index  component  Cj  e  Bj  e  BS /  with  an  index  component  ege  B^e  BS2  is  a  Structured 
Model 

Proof  :  It  follows  the  lines  of  proposition  5.« 

Given  the  results  of  the  proposition  5  and  6  the  following  procedures  can  be 
constructed. 

procedure  SUBSTITUTE  (Input:  NiMil,  N  M2).  5 \ .  &2 : 

Output:  IN(Mj),  N«M2>I>; 

begin 

Create  a  LIST  of  genera  gx  €  FT 1  c  Mi ; 

r •p««c 

Select  gx  from  LIST 

Substitute  A 1  €  Bi  with  ki  €  Bt  in  the  calling  sequence  of  gv; 

LIST  : =  LIST  -  gj; 
until  end  or  LIST: 

Delete  Bj; 

and  . 

proetdur*  MERGE  (Input:  N<Mj),  NIM^).  Bj  ,  Bn  r 
-utput:  { *  M 1  >  ,  !  1 1 M  2  *  J  1  - 

btgin 

Select  Cj.  ai  €  Bj.  C2  €  Bt; 

Substitute  ci  with  C2  in  the  calling  sequence  of  aj; 

•  nd  . 

In  the  next  we  treat  the  core  example  extracted  from  Geoffrion  [4],  The  sub  models  to 
be  integrated  are  shown  in  figure  3  (the  details  are  omitted): 


fa  mkt  mar  mfg 

Figure  3 
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EXTENDED  ABSTRACT 


Introduction 

As  pointed  out  by  many  authors,  a  Model  Management  System  (MMS)  provides  for 
creation,  storage,  manipulation,  and  access  to  models.  MMS  functions  can  be  divided  in 
two  main  groups:  Model  storage  functions  and  Model  manipulation  functions.  The 
former  includes  Model  Building,  Model  Representation,  physical  and  logical  Model 
Storage  and  Model  Retrieval;  the  latter  includes  Model  Instantiation,  Interface  with 
Databases,  Model  Maintenance,  Links  between  model  and  Algorithms,  and  Model 
Solving. 

Model  representation  schemes  plays  a  key  role  in  the  implementation  of  effective 
MMSs.  To  fully  implement  the  functions  of  MMSs,  we  need  to  state  a  rigorous 
conceptual  framework  with  a  single  model  representation  leading  to: 

1)  independence  of  model  representation  and  model  solution. 

2)  representational  independence  of  general  model  structure  and  detailed  data 
needed  to  describe  specific  model  instances. 

A  system  based  on  these  ideas  would  show  its  usefulness  for  most  phases  of  the 
life-cycle  associated  with  model-based  work  (Geoffrion  1987).  For  example,  consider  a 
mathematical  programming  problem.  Once  a  model  of  this  problem  has  been 
constructed,  a  MMS  should  allow  the  user  to  perform  the  following  steps: 

1)  select  the  solution  technique  (if  any), 

2)  soive  the  model, 

3)  conduce  sensitivity  analysis. 

To  automate  steps  1  and  2,  the  system  has  to  be  able: 

a)  to  recognize  what  kind  of  model  arises  (so  that  it  could  automatically  select 
the  appropriate  solver); 

b)  to  translate  data  instantiating  the  model  (querying  the  Database  where  they 
are  stored)  into  the  format  required  by  the  selected  solver. 

This  paper  will  focus  on  the  model  recognition  phase.  We  will  try  to  give  its 
theoretical  foundations  and  to  define  which  conditions  a  model  definition  language  has 
to  satisfy  so  that  the  resulting  representation  is  "recognizable" 

Our  formalization  of  the  recognition  process  is  based  on  the  concept  of  “minimal 
representation”.  A  representation  of  a  model  is  minimal  if  any  other  equivalent 
representation  of  the  same  model  can  be  “reduced”  to  it. 
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2.  Model  Recognition  Problem:  Preliminary  Resuits. 

The  aim  of  this  section  is  to  provide  for  some  formal  definitions.  In  the  next,  we  wiii 
use  them  to  illustrate  how  recognition  process  can  be  earned  out 

The  recognition  process  we  are  trying  to  formalize  r  based  on  the  concept  of 
minimal  representation.  A  representation  of  a  model  is  minimal  if  any  otner  equivalent 
representation  of  the  same  model  can  be  reduced  to  it. 

in  the  rest  of  this  paper  we  will  define  and  explain  minimality,  equivalence  tins' 
reduction  of  model  representations;  first  we  need  to  define  wr.at  we  intend  for  ‘  model’ 
and  “model  representation”. 

Definition  1 

We  define  the  system  M  to  be  a  model  of  the  system  P  it: 

—  M  does  not  interact  neither  directly  nor  indirectly  with  0 
—  M  is  used  to  obtain  information  about  P 

—  M  comprises  all  the  elements  of  P  relevant  tor  the  intended  purpose  ol  tne  mooei 


Definition  2 

Given  a  formal  language  L  and  a  model  Mv  we  define  L(M,)  to  be  its  iormai  representation  oncer  t , 
if  it  comprises  the  expression  in  language  L  of  aU  the  elements  of  M,  ana  of  the  interactions  existing 
among  them 

In  the  following  we  will  use  the  terms  “model  representation”  or  simply 
“representation”  to  indicate  the  “formal”  representation  of  a  given  model  under  some 
formal  language. 

Let  us  consider,  as  an  example,  model  M,  as  the  model  of  the  system  P  that  computes 
the  mean  of  a  given  series  of  values  belonging  to  P;  if  L  is  the  standard  algebra:' 
notation,  then  L(M,)  will  be: 


n 


mean  =  —  [1] 


If  L(Mj)  exists  and  is  unique,  then  the  recognition  problem  has  a  trivial  solution, 
because  there  is  a  1:1  correspondence  between  model  and  its  representation. 
Unfortunately,  except  for  very  few  cases,  the  model  M,  has  many  representations 
Lj(Mi),  j=l,  ....  n,  n>l.  Referring  to  the  previous  example,  two  other  ways  to  represent 
the  same  model  are  the  following  ones: 
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sum 


mean 


sum 

n 


[2] 


z 


result  =  — - -  [3] 


It  is  intuitive  that  all  previous  representations  are  equivalent,  in  so  far  as  they  “do  the 
same  thing".  Nevertheless,  for  our  purposes  we  need  a  more  ngorous  definition  of 
equivalence  based  on  the  concept  of  ‘transformation  rule”. 

V/e  can  think  to  a  transformation  rule  as  to  a  iuncuon  or  procedure  whose  input  is 
the  whoic  modei  representation  or  a  part  of  it.  and  whose  output  is  a  new  model 
representation  or  a  part  ot  it.  Obviously,  the  output  of  a  transformation  rule  must  be 
semanticaily  consistent  with  its  input.  Let  us  give  its  formal  definition: 

Definition  3 

Consider  ?.  rormal  language  t  and  two  distinct  sets  E.  ana  E2  of  expresfons  of  L  semantically 
.dentical.  Let  R  be  the  set  of  all  transformation  rules  defied  on  L:  r  e  R  is  defined  to  be  a 
transformation  rule  on  l  rf  applied  to  E,  transforms  rt  rto  E? 

The  existence  of  transformation  ruies  is  very  important  to  state  formally  the 
equivalence  of  model  representations.  Two  equivalent  representations  must  be 
semantically  identical:  in  other  words,  there  exist  two  (sets  of)  transformation  rules  that 
transform  one  into  tr.e  other,  and  viceversa.  We  can  formalize  the  equivalence  between 
model  representations  as  follows: 

Definition  4 

Let  S|_=  ( Lj(M J:  |=1 n;  n>  1 }  be  the  set  of  ad  possible  representation  of  M,  in  (he  language  L  Two 

representations  L^MJ  e  SL ,  j*fc,  are  defined  to  be  equivalent  if  (here  are  two  sets  of 

transformation  ruies,  R(  and  R?,  defined  on  L  such  that  Ri  applied  to  LjfWJ  transform  it  into  L^MJ. 
and  R2  applied  lo  L(MJ  transform  it  into  Lj(M^.  If  R,  =  R2  then  the  two  representations  are  defined 
Identical.  Obviously,  identical  representations  are  also  equivalent. 

As  an  example,  let  us  consider  two  iransformauon  rules,  called  split  and  join  suitable 
to  be  applied  to  representations  [1)  and  (2).  The  terms  LHS  and  RHS  stand  for 
respectively  “left  hand  side”  and  “right  hand  side”. 


transformation  split 
input 

induction  type  fraction 
output 

r,ut_assignment  type  assignment  siatemem 
cutfracuor.  type  fraction 
begin 

set  RHS  of  in_assigmnent  to  numerator  of  infraction 
set  numerator  of  out  fraction  to  LHS  of  out  assignment 
set  denominator  of  out_fraction  to  denominator  of  infraction 
end 
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transformation  join 

Input 

in_«ssigmnent  type  assignment  statement 
m_fracuon  type  fraction 
output 

out_£raction  type  fraction 
begin 

if  LHS  of  in_assignmem  =  numerator  of  m_fi»ction  then 
exit  join 

set  numerator  of  out  fracuon  to  RHS  of  in  . assignment 
set  denominator  of  out.ffactton  to  denominator  of  in  fraction 

end 

The  rule  split  performs  the  following  operations:  given  a  fraction,  it  reads  ns 
numerator  and  assigns  it  to  an  intermer'iate  variable,  and  t.._.i  it  builds  another  fraction 
whose  numerator  and  denominator  arc.  respectively  the  intermediate  variable  and  the 
denominator  of  the  given  fraction. 

The  rule  join  acts  as  follows:  given  an  assignment  statement  and  a  fraction  whose 
numerator  is  the  variable  on  the  left  hand  side  ot  the  assignment  statement,  it  builds  a 
new  fraction  whose  numerator  and  denominator  are  respectively  the  right  hand  side  of 
the  assignment  statement,  and  the  denominator  of  the  of  the  given  fraction 

Since  we  can  transform  representation  (1J  into  representation  (2|  and  vice  versa  by 
applying  respectively  transformation  rules  split  and  join ,  they  are  equivalent  in  the 
sense  expressed  in  Definition  i.  They  are  not  identical,  since  transformation  ruies  we 
need  to  appiy  are  different. 

Let  us  now  consider  a  third  rule,  called  rename,  which  renames  all  the  eiements  of  a 
model  definition,  or  a  pan  of  them,  subiect  to  the  simple  constraint  that  ail  elements 
with  identical  name  in  the  input  model  representation  must  have  identical  name  in  the 
output  one.  Model  representation  ’3]  is  one  of  the  possible  results  of  applying  rule 
rename  to  [  I }.  Since  transformation  rule  we  need  to  apply  to  transform  representation 
[1]  into  representation  (3|  and  vice  versa  is  the  same,  they  are  identical. 

3.  Mode!  Recognition  Problem:  Basic  Ideas. 

As  asserted  in  first  section  of  this  paper,  our  main  task  is  to  determine  which 
conditions  have  to  be  satisfied  so  that  the  recognition  of  a  model  can  be  performed.  For 
this  purpose,  we  state  that  the  language  L  must  allow  that  the  set  of  model 
representations  it  produces  can  be  ordered  by  rank.  The  rank  is  a  measure,  defined  on 
some  measurable  aspect  of  L,  which  allows  to  class  and  order  model  representations. 
We  formalize  that  as  follows: 

Definition  5 

A  lormai  language  L  satisfies  the  property  ot  rankaWlity  it 

—  all  model  representations  LjfM,)  e  St.  are  equivalent, 

—  all  model  representations  Lj(MJ  e  SL  can  be  ranked 

—  Sj.  can  be  partitioned  by  rank  and  all  the  elements  in  the  same  cell  ot  the  partition  are 

identical. 

In  previous  examples  we  might  consider  the  number  of  equations  as  rank.  If  so.  then 
representation  (1]  and  [3]  are  of  rank  1  whiie  representation  (2j  is  of  rank  2.  Since  ail 
representations  are  equivalent,  and  representations  (1]  and  [3]  are  identical,  then  the 
property  of  rankabiiiiy  holds. 

Now,  let  us  explain  how  recognition  process  can  be  carried  on.  To  recognize  a  model 
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representation  means  that  we  have  to  determine  the  model  it  represents.  The  basic  idea 
of  this  process  is  to  transform  the  representation  to  recognize  into  another  one  that  we 
know  the  kind  of  model  it  represents.  So  doing,  we  have  “recognized’’  the  model. 

If  the  representation  we'  deal  with  are  expressed  in  a  language  L  satisfying 
rankability,  then  all  representations  of  the  same  model  are  equivalent.  So,  if  we  know 
all  transformation  rules  that  language  L  allows,  then  we  can  recognize  any 
representation  simply  transforming  it  into  the  known  one  by  applying  to  it  the 
appropriate  transformation  rules. 

The  set  of  all  transformation  rules  may  be  incredibly  large  or  even  not  finite.  This 
fact  can  influence  the  efficiency  of  the  recognition  process.  The  recognition  process  can 
be  carried  out  more  efficiently  if  it  is  based  on  the  ideas  of  “minimal  representation" 
and  of  “reduction  ruie”  defined  as  follow: 

Definition  S 

A  model  reoresentation  UMJ  is  delmed  to  be  minimal  ii: 

—  L  satisfies  the  property  of  ranxabdity; 

—  it  has  the  lowest  possible  rank 


Definition  7 

Given  a  language  L  satisfying  rankability,  reduction  rules  are  defined  to  be  transformation  rules 

which  when  applied  to  a  model  representation  L*(MJ  e  S^of  rank  k  produce  a  model  representation 

Lj(M)e  Si_ofrankj<k. 

Referring  to  previous  example,  we  can  consider  representations  (1)  and  [2]  as 
minimal  ones. 

Property  of  rankability  plays  a  crucial  role  for  our  purposes;  in  fact,  if  L  satisfies 
rankability,  all  reduction  rules  are  known,  and  they  form  a  finite  set  then: 

—  it  always  admit  a  minimal  representation  (i.e.  a  representation  which  has  the 
lowest  possible  rank,  and  to  which  any  other  representation  of  the  same  model 
can  be  reduced); 

—  any  model  representation  in  language  L  can  be  reduced  in  its  minimal  form 
(by  applying  to  it  the  appropriate  reduction  rule  until  no  more  rule  can  be 
applied); 

—  all  minimal  representations  of  the  same  model  are  identical. 

Under  the  above  mentioned  condition,  the  recognition  process  of  a  given  model 
representation  can  be  based  on  the  minimal  representation  by  performing  the  following 
basic  steps: 

1)  reduce  the  model  representation  to  recognize  to  its  minimal  form; 

2)  search  among  the  “known”  minimal  model  representation  for  a  template 
matching  the  minimal  representation  obtained  by  step  1 . 

Since  for  any  given  language  L,  the  set  of  the  reduction  rules  must  necessarily  be  a 
subset  of  the  set  of  the  transformation  rules,  the  recognition  process  of  a  given  model 
based  on  the  minimal  representation  is  more  efficient  than  the  previous  one. 

Now,  we  can  define  formally  the  condition  under  which  a  given  model  definition 
language  generates  “recognizable”  model  representations; 
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Definition  8 

A  model  representation  is  defined  to  be  recognizable  if  the  recognition  process: 
—can  be  based  on  its  minimal  representation 
—  can  be  performed  in  a  finite  number  of  steps. 


Claim  1 

A  formal  model  definition  language  L  generates  ‘recognizable-  model  representations  if: 

—  it  satisfies  ffie  property  ot  rankability. 

—  the  set  of  alf  reduction  rules  it  admits  is  finite. 


Proof: 

If  l  satisfies  property  of  rankability  then  it  always  admit  a  mimmai  representation  if  the  set  of  the 
reduction  rules  is  finite  any  model  representation  can  be  reduced  to  its  minimal  form  in  a  finite 
number  of  steps.  In  this  way  both  the  conditions  which  state  the  recognzability  of  a  model 
representation  are  satisfied.  ■ 


3.  Conclusions 

Here  we  have  sketched  the  fundamental  lines  to  "recognize”  models  representation. 
It  seems  to  us  that  the  idea  of  minimality  looks  very  promising  to  be  further 
investigated. 
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Extended  Abstract 

This  paper  describes  a  simple  methodology  for  reasoning  about  temporal 
and  precedence  constraint  satisfiability  problems  arising  in  job  scheduling.  In 
particular,  a  Constraint  Satisfaction  Problem  (CSP)  approach  is  presented. 

Several  researchers,  coming  both  from  Artificial  Intelligence  (AI)  and 
Operations  Research  (OR)  have  investigated  methods  for  dealing  efficiently 
with  time  (see,  e.g.,  [2, 3,  7, 12]);  however,  at  least  to  the  author’s  knowledge, 
oniy  very  few  real  and  large  scale  scheduling  applications  have  been 
approached  using  this  relatively  new  technique  [4]. 

In  this  paper,  among  all  the  job  scheduling  problems,  an  application  in 
which  a  set  V  of  n  jobs  has  to  be  processed  on  a  single  machine  is  considered, 
such  that  a  release  date  ri,  a  deadline  di  and  a  process  time  pi  are  associated 
with  each  job  i  E  V.  The  problem  is  formulated  on  a  constraint  network,  i.e., 
a  digraph  G  =  (V,  A)  of  n  nodes  (jobs).  An  arc  (ij)  E  A  means  that  job  j  can 
be  processed  immediately  after  job  i.  A  weight  pj  and  the  attributes  rj  and  dj 
for  each  node  j  E  V  are  given.  Moreover,  a  digraph  P  =  (V,E),  with  E  C  A, 
is  given  such  that  an  arc  (i  j)  6  E  represents  a  precedence  constraint  between 
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jobs  i  and  j.  The  problem  consists  of  determining  the  starting  time  for 
processing  the  jobs  in  V  such  that  the  time  windows  (defined  by  rj  and  dj)  for 
scheduling  the  execution  of  each  node  (job)  is  satisfied  and  the  precedence 
constraints  between  nodes  given  by  the  relationships  defined  in  arc  set  E  are 
satisfied  within  a  time  horizon  (production  plan). 

Based  on  the  Allen’s  model  for  temporal  logic  [1],  a  CSP  formulation  is 
first  presented.  A  CSP  consists  of  a  set  of  variables  X  =  {xi,  X2, xn  },  their 
associated  domains  Di,  D2, ...,  Da  and  a  set  C  of  constraints  on  these  variables. 
A  solution  to  a  CSP  consists  of  an  instantiation  of  ail  the  variables  which  does 
not  violate  any  of  the  constraints.  In  the  case  of  the  application  considered  in 
this  paper,  let  X  be  the  set  of  variables  such  that  xi  represents  the  starting  time 
for  processing  job  i,  V  i  6  V.  A  domain  Di  is  associated  with  each  variable  xi 
such  that  Di  =  { set  of  available  Time  Machine  Units  (TMUs)  for  processing 
job  i  (production  plan)  }.  The  set  C  of  constraints  is  defined  by  two  classes  of 
constraints,  namely  Ci  and  C2,  such  that  C  =  C1UC2,  Ci  =  {  unary  constraints 
(time  interval)  }  =  {fiViEV}U{diVieV}  and  C2  =  {  binary  constraints 
(precedences)  }  =  {  (ij)  £  E  },  The  problem  is  to  verify  whether  an 
instantiation  of  all  the  variables  is  possible  such  that  all  the  jobs  are  completed 
within  their  time  interval  and  no  precedence  relationship  is  violated. 

Starting  from  the  Allen’s  interval  algebra,  the  temporal  relations  are 
specified  by  atomic  relations.  In  particular,  for  each  pair  i  j  of  jobs  the  following 
atomic  relations  are  defined: 

-  After(j,i ):  this  specifies  the  precedence  relationship  between  i  and  j,  i.e., 
(U)  €  E; 

-  Availdble(i,n,Di):  this  specifies  the  release  date  of  job  i  within  the 
production  plan; 


-  Due(i,di,Di):  this  specifies  the  deadline  of  job  i  within  the  production 
plan. 

The  constraint  network  G  of  this  problem  is  then  "preprocessed"  such  that 
to  compute  the  tightest  possible  bound  for  both  unary  and  binary  constraints 
on  the  jobs.  In  particular,  given  the  explicit  precedence  relationships  between 
jobs  the  possibility  of  inferring  additional  implicit  precedence  relationships  are 
explored;  for  instance,  the  transitivity  of  the  predicate  After(j,i)  may  allow  to 
infer  information  such  that 

-  After(j,k)  H  After(k,i)  -*  After(j,i)  . 

Moreover,  the  availability  interval  of  each  job  within  the  production  plan 
is  computed  by  considering  its  release  date,  deadline  and  precedence 
relationships.  The  new  domain  Di’  for  each  job  i  in  V  is  hence  computed  such 
that  the  predicate 

-  Di’  =  Interval(i,n,di)  =  Di  fl  Available(i,n,Di)  D  Due(i,di,Di) 

returns  the  restricted  time  interval  in  which  each  job  has  to  be  processed 
in  order  to  obtain  a  feasible  scheduling  of  the  jobs.  Note  that  all  the  possible 
instantiations  of  the  corresponding  variables  are  thus  noticeably  reduced  after 
the  computation  of  Di’,  V  i  €  V. 

It  is  worth  mentioning  that  such  a  preprocessing  approach  allows  for 
further  generalization  of  the  proposed  scheduling  problem;  for  instance,  it 
could  be  necessary  to  take  into  account  a  possible  decomposition  of  the  jobs 
into  different  subtasks  [13],  to  analyze  periodic  scheduling  problems  [9]  or  to 
consider  setup  times  between  jobs  [5]. 
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A  Prolog-like  algorithm  is  then  presented  for  finding  a  consistent 
assignment  for  the  variables,  i.e.,  an  instantiation  of  all  the  variables  which  does 
not  violate  any  of  the  constraints  given  by  both  Ci  and  C2.  In  particular,  the 
procedure 

-  xi  =  Assign  (i,Di) 


associates  a  value  in  the  new  domain  Di’  with  the  corresponding  variable 
xi,  such  that  a  feasible  starting  time  for  processing  job  i  is  given. 

In  this  phase,  following  the  most-constrained  approach  suggested  in  [12], 
the  job  having  the  tightest  constraints  is  selected  first.  In  particular,  the 
procedure 

-  Preorder(X) 

performs  a  sort  of  the  set  of  variables  in  such  a  way  that  the  most  critical 
job,  i.e.,  the  most  constrained  job,  is  chosen  first  for  its  instantiation. 

In  this  particular  application  the  most  constrained  path  is  proven  to  be  the 
most  efficient  implementative  approach,  in  the  sense  that  the  number  of 
backtrackings  is  minimized  (see,  e.g.,  [6, 7]  for  an  overview  of  the  complexity 
of  this  kind  of  temporal  CSP  problem). 

Note  that  a  different  way  for  finding  a  feasible  instantiation  of  all  the 
variables  is  to  look  for  an  initial  solution,  possibly  inconsistent,  and  then 
incrementally  repair  constraint  violations  until  a  consistent  assignment  is 
achieved.  Such  an  approach  is  proposed  in  [10]  in  the  case  of  scheduling 
problems  without  precedence  and  time  window  constraints. 
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The  application  field  and  computational  experiences  related  to  real-life 
cases  are  also  given  in  the  full  paper.  Some  conclusions  along  with  a 
comparison  with  a  more  traditional  mathematical  programming  approach  (see, 
e.g.  [5,  8])  for  solving  the  scheduling  problem  under  consideration  are  finally 
derived. 
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